CN102520906A - Vector dot-product accumulate network supporting reconfigurable fixed/floating point and configurable vector length
- Publication number: CN102520906A
- Authority: CN (China)
- Priority/filing date: 2011-12-13
- Legal status: Pending (application later deemed withdrawn after publication)

Description
Technical Field
The present invention relates to the technical field of high-performance digital signal processors, and in particular to a vector dot-product accumulate network that supports fixed/floating-point reconfiguration and a configurable vector length.
Background
In modern digital signal processing, the digital signal processor (DSP) is the core of the whole system, and DSP performance directly determines system performance. Inside a DSP, every computation, no matter how complex, is ultimately carried out by the arithmetic unit; the arithmetic unit is therefore the DSP's core component, and its computing power is the main index by which DSP performance is measured. In particular, as technology develops, compute-intensive fields such as modern radar signal processing, spaceborne satellite image processing, image compression, and high-definition video place ever higher demands on signal-processing capability. This poses growing challenges for the arithmetic unit, above all the pressure created by domain-specific variable-size, high-density parallel computation.
Modern digital signal processing makes heavy use of the "dot product" operation, for example in FFT, FIR filtering, and signal correlation. All of these operations multiply an input signal by a coefficient or local parameter and then integrate (accumulate), which amounts to taking the dot product of two vectors (or sequences). Mainstream DSPs currently have no dedicated dot-product instruction; the operation is generally completed by several multiply, accumulate, or multiply-accumulate instructions. This approach has several common drawbacks:
1) Low hardware utilization. A vector dot product uses only scalar multiply, accumulate, or scalar multiply-accumulate resources; vector computation resources go unused.
2) Weak processing capability. Typically only 16/32-bit scalar dot products can be executed, so a vector dot product must be assembled from many scalar dot products, giving low data throughput and poor efficiency.
3) Long execution time. One dot product requires multiple multiply, accumulate, or multiply-accumulate instructions executed serially, with data dependences between them, so the operation takes many clock cycles.
4) Limited flexibility. Only a specific data format or a specific dot-product length is supported, and the number of elements participating in the dot product is not configurable.
5) Difficult programming. The dot product is built from multiply, accumulate, or multiply-accumulate operations, none of which is a single-cycle instruction, so the programmer must reason about data dependences among the instructions.
The dot product belongs to the category of multi-operand computation, whose key design issues are sign extension and data precision. Several patents have discussed how to implement multi-operand arithmetic. The patent "A Reconfigurable Horizontal Summation Network Supporting Fixed and Floating Point" (application number 201010162375.X) proposes a multi-operand addition, but it does not analyze the precision of the floating-point data format, and the overall data length is not configurable. The patent "Instruction and Logic for Performing a Dot-Product Operation" (application number 201010535666.9) proposes a scheme for executing dot-product instructions with a configurable data format, but it remains a scalar dot product, and the number of elements participating in the operation is not configurable. The patent "Multifunctional Unit for a SIMD Vector Microprocessor" (application number 201010559300.5) describes a vector floating-point multiply-add unit, but it does not reconfigure fixed-point data, and again the number of elements participating in the multiply-add is not configurable.
Analyzing the dependences of the dot product at the arithmetic level shows that it is composed of multiplication and accumulation. The traditional approach of executing vector dot products on scalar units can therefore be replaced: reuse the existing vector multiplication resources, add accumulation-network resources, and execute the vector dot product on vector units, which fundamentally raises dot-product performance. At the same time, by analyzing the relationship between floating-point and fixed-point data formats, floating-point data can be preprocessed so that it reuses the fixed-point datapath; reconfigurable compression supports different data formats and granularities; and a Mask register configures the number of elements participating in the dot product. Providing such a vector dot-product accumulate network supporting different granularities, data formats, and vector lengths, and thereby powerful vector dot-product capability for digital signal processing, is the problem the present invention sets out to solve.
Summary of the Invention
(1) Technical Problem to Be Solved
In view of this, the main purpose of the present invention is to provide a vector dot-product accumulate network with fixed/floating-point reconfigurability and a configurable vector length. It supports dot products on 8/16/32-bit fixed-point data and on 32-bit reduced IEEE-754 single-precision floating-point data, allows the length of the vectors participating in the dot product to be configured flexibly, and makes rational use of vector computation resources, so as to raise the efficiency and throughput of vector dot products, simplify their software programming, and meet the demand for powerful vector dot-product computation in digital signal processing.
(2) Technical Solution
To achieve the above purpose, the present invention provides a vector dot-product accumulate network with fixed/floating-point reconfigurability and configurable vector length, comprising: a parallel reconfigurable multiplier 1, which receives vector data B and C and the data options FBS and U as inputs, performs the vector multiplication, and outputs the product B×C to the floating-point exponent and mantissa preprocessing part 2; a floating-point exponent and mantissa preprocessing part 2, which receives the product B×C from the parallel reconfigurable multiplier 1 and the scalar data A as inputs, performs maximum-exponent selection, exponent differencing, shift alignment, two's-complement conversion, and sticky-bit compensation, and outputs the processed vector result B×C and scalar result A to the reconfigurable compressor part 3; a reconfigurable compressor part 3, which receives and compresses the output of part 2 into a "sum string" (S) and a "carry string" (C) and passes them to the floating-point exponent and mantissa post-processing / fixed-point operation part 4; and a floating-point exponent and mantissa post-processing / fixed-point operation part 4, which adds the sum string S and carry string C received from part 3 and post-processes the mantissa sum to obtain the final vector dot-product accumulate result.
(3) Beneficial Effects
The vector dot-product accumulate network provided by the present invention preprocesses floating-point data so that it reuses the fixed-point datapath, supports different data formats and granularities through reconfigurable compression, and uses a Mask register to configure flexibly the length of the vectors participating in the dot product. It supports vector dot products on 8/16/32-bit fixed-point data and on reduced IEEE-754 single-precision floating-point data. The design delivers high performance with low overhead, rich functionality, compact encoding, and high speed; it shortens the critical-path delay of the floating-point vector dot-product accumulate, reduces the resources consumed by its fixed-point counterpart, simplifies software programming, and improves code density.
Description of Drawings
Figure 1 is a schematic structural diagram of the vector dot-product accumulate network supporting fixed/floating-point reconfiguration and configurable vector length, according to an embodiment of the present invention;
Figure 2 is a schematic diagram of 8×8 multipliers composed into a 16×16 multiplier, according to an embodiment of the present invention;
Figure 3 is an array diagram of 8×8 multipliers composed into a 16×16 multiplier, according to an embodiment of the present invention;
Figure 4 is a schematic diagram, within the floating-point exponent and mantissa preprocessing part, of the network that compares three exponents at a time to obtain the larger floating-point exponent, according to an embodiment of the present invention;
Figure 5 is a schematic diagram, within the floating-point exponent and mantissa preprocessing part, of the cascaded exponent-comparator network that obtains the maximum floating-point exponent, according to an embodiment of the present invention;
Figure 6 is a schematic diagram of the floating-point mantissa processing within the floating-point exponent and mantissa preprocessing part, according to an embodiment of the present invention;
Figure 7 is a schematic diagram of the 8/16/32-bit fixed-point and 32-bit floating-point reconfigurable compressor network within the reconfigurable compressor part, according to an embodiment of the present invention;
Figure 8 is a schematic diagram of the 32-bit sub-compressor within the reconfigurable compressor part, according to an embodiment of the present invention;
Figure 9 is a schematic diagram of the parallel exponent correction unit within the floating-point exponent and mantissa post-processing / fixed-point operand part, according to an embodiment of the present invention.
Detailed Description
To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
The main features of the invention are a reconfigurable data format and a configurable vector length. The following notation is used throughout: the dot-product instruction is written D = A + B DOT C {(U)}{(M)}{(FBS)}, where A and D are 32-bit scalar data, B and C are 512-bit vector data, and DOT is the dot-product operator. Mask is a 64-bit register whose bits each control one 8-bit byte of the B×C result. The M option indicates that the vector dot-product accumulate is governed by the Mask register; when the M option is absent, the Mask register has no effect on the operation. U is the unsigned option. FBS selects the data format as a 2-bit binary code: "00" denotes 32-bit fixed point, "01" 32-bit reduced single-precision floating point, "10" 8-bit bytes, and "11" 16-bit halfwords.
The embodiments assume that B and C are 512-bit vectors, but the invention applies wherever the bit width of B and C is a multiple of 32; the Mask register width relates to the vector length as LengthMask = LengthB / 8. A reference model of the instruction semantics is sketched below.
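The sketch is a minimal Python reference model of the fixed-point form of the instruction. It is illustrative rather than normative: the function name is ours, the exact effect of byte gating on masked lanes is one simplified reading of the Mask semantics, and saturation and the reduced floating-point format are omitted.

```python
# Hypothetical reference model of D = A + B DOT C for fixed-point lanes.
# Assumption: each Mask bit zeroes one byte of the corresponding B*C product
# before accumulation (the patent gates bytes of the B*C result by Mask bits).
def vdot_accumulate(a, b, c, mask=None, unsigned=False, lane_bits=32):
    lane_mask = (1 << lane_bits) - 1
    bytes_per_lane = lane_bits // 8
    acc = a
    for i, (bi, ci) in enumerate(zip(b, c)):
        prod = (bi * ci) & lane_mask              # product truncated to the lane width
        if mask is not None:                      # M option: byte-level gating
            keep = 0
            for k in range(bytes_per_lane):
                if (mask >> (i * bytes_per_lane + k)) & 1:
                    keep |= 0xFF << (8 * k)
            prod &= keep
        if not unsigned and (prod >> (lane_bits - 1)):
            prod -= 1 << lane_bits                # reinterpret lane as signed
        acc += prod
    return acc

# Sixteen 32-bit lanes of a 512-bit vector; with all 64 Mask bits set this is
# a plain dot product: D = 7 + 16 * (3 * 5).
b, c = [3] * 16, [5] * 16
assert vdot_accumulate(7, b, c, mask=(1 << 64) - 1) == 7 + 16 * 15
```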
As shown in Figure 1, the vector dot-product accumulate network according to an embodiment of the present invention comprises, connected in sequence, a parallel reconfigurable multiplier part 1; a floating-point exponent and mantissa preprocessing part 2; a reconfigurable compressor part 3; and a floating-point exponent and mantissa post-processing / fixed-point operation part 4. Specifically:

The parallel reconfigurable multiplier 1 receives vector data B and C and the data options FBS and U as inputs, performs the vector multiplication according to the selected data format, and outputs the result B×C to the floating-point exponent and mantissa preprocessing part 2.

The floating-point exponent and mantissa preprocessing part 2 receives the product B×C from the multiplier 1 and the scalar data A as inputs, performs maximum-exponent selection, exponent differencing, shift alignment, two's-complement conversion, and sticky-bit compensation, and outputs the processed vector result B×C and scalar result A to the reconfigurable compressor part 3.

The reconfigurable compressor part 3 receives the output of part 2 and compresses it into a "sum string" (S) and a "carry string" (C), which it outputs to the floating-point exponent and mantissa post-processing / fixed-point operation part 4.

The floating-point exponent and mantissa post-processing / fixed-point operation part 4 receives the sum string S and carry string C from part 3 and adds the mantissas. In fixed-point formats, the sum is processed directly into the fixed-point vector dot-product accumulate result; in floating-point format, leading-one detection, normalization shift, normalization rounding, exponent adjustment, and sign adjustment are performed to obtain the floating-point vector dot-product accumulate result.
The network is described in detail below with reference to Figures 2 to 9. Its implementation combines parallel, cascaded, reconfigurable, and configurable design.
The parallel reconfigurable multiplier 1 uses 16 identical 32-bit reconfigurable multipliers 11, supporting 8/16/32-bit fixed-point multiplication and 32-bit reduced IEEE-754 single-precision floating-point multiplication; the 16 multipliers work in parallel for a throughput of 16×32 bits. Each 32-bit reconfigurable multiplier is composed from basic 8×8 multipliers and can perform 8/16/32-bit signed/unsigned fixed-point multiplication as well as 32-bit reduced single-precision floating-point multiplication, producing a 512-bit (16×32) product. Wider multipliers are composed from narrower ones: with the 8×8 multiplier as the basic unit, four 8×8 multipliers are composed into one 16×16 multiplier, and four 16×16 multipliers into one 32×32 multiplier.
As shown in Figure 2, an 8×8 multiplier group is composed into a 16×16 multiplier; the 16-bit multiplication is described mathematically by Formula 1:
A×B = (AH×2^8 + AL) × (BH×2^8 + BL)
    = AH×BH×2^16 + (AH×BL + AL×BH)×2^8 + AL×BL    (Formula 1)
where AH/BH and AL/BL are the high and low 8 bits of the 16-bit data A/B, respectively. First, the four 8×8 multipliers work in parallel, each producing two compressed results through its Wallace compression tree; the eight compressed results of the four multipliers are weight-aligned and fed together into one 8-2 compressor to obtain the final sum string S and carry string C. S and C pass through a 24-bit adder to produce the high 24 bits of the 16×16 product, while the low 8 bits come directly from the low 8 bits of the AL×BL multiplier, whose carry-out serves as the carry-in of the 24-bit adder. Finally, a selector chooses the output: in 8×8 mode it outputs the two 16-bit results of the AL×BL and AH×BH multipliers; in 16×16 mode it outputs the 32-bit result formed by the 24-bit adder output and the low 8 bits of AL×BL.
Figure 3 shows the array diagram of 8×8 multipliers composed into a 16×16 multiplier; in the same way, four 16×16 multipliers are composed into one 32×32 multiplier.
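As an illustration of Formula 1 itself (not of the hardware structure), the sketch below composes a 16×16 unsigned multiply from four 8×8 partial products in Python; the function name is ours.

```python
# Formula 1 in executable form: A*B = AH*BH*2^16 + (AH*BL + AL*BH)*2^8 + AL*BL.
# This models only the arithmetic identity, not the Wallace trees, the 8-2
# compressor, or the 24-bit adder of the actual design.
def mul16_from_mul8(a, b):
    ah, al = a >> 8, a & 0xFF
    bh, bl = b >> 8, b & 0xFF
    hh, hl, lh, ll = ah * bh, ah * bl, al * bh, al * bl   # four 8x8 partial products
    return (hh << 16) + ((hl + lh) << 8) + ll

import random
for _ in range(1000):
    a, b = random.randrange(1 << 16), random.randrange(1 << 16)
    assert mul16_from_mul8(a, b) == a * b
```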
The above describes the idea of composing narrower multipliers into wider ones; in the concrete implementation, however, the multipliers of this invention all use saturate-and-truncate handling by default. That is, the 8×8 multiplier keeps only the low 8 bits of its result; when the product overflows into the high bits, the low 8 bits saturate to the maximum/minimum value. Likewise, the 16×16 multiplication keeps the low 16 bits of the result, and the 32×32 multiplication the low 32 bits.
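A hedged sketch of this saturate-and-truncate behavior follows; the signed bounds assume ordinary two's-complement limits, which the patent does not spell out.

```python
# Result handling assumed here: keep the low `bits` bits of an N x N product,
# saturating to the maximum/minimum representable value on overflow.
def mul_saturate_truncate(a, b, bits=8, signed=True):
    prod = a * b
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return max(lo, min(hi, prod))

assert mul_saturate_truncate(100, 100) == 127    # overflow saturates to max
assert mul_saturate_truncate(-100, 100) == -128  # underflow saturates to min
assert mul_saturate_truncate(5, 6) == 30         # in-range product is exact
```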
As shown in Figure 1, the floating-point exponent and mantissa preprocessing part 2 comprises a cascaded exponent comparator 21, an exponent-difference array 22, a shift-alignment unit 23, a two's-complement conversion unit 24, and a sticky-bit compensation unit 25. The cascaded exponent comparator 21 obtains the maximum Emax of the 17 floating-point exponents. The exponent-difference array 22 computes the difference Emax − Ei between each floating-point exponent Ei and the maximum Emax; this difference is the shift distance used by the shift-alignment unit 23. The shift-alignment unit 23 takes the output Emax − Ei of the array 22 as its control signal and right-shifts the floating-point mantissas into alignment. The two's-complement conversion unit 24 negates particular shifted mantissas, namely those whose sign bit differs from the sign bit of the operand holding the maximum exponent. The sticky-bit compensation unit 25 applies one bit of compensation for shifted-out mantissa bits and for mantissas requiring complementing, producing the 17 compensation bits.
The preprocessing part 2 takes the 16 floating-point products from the parallel reconfigurable multiplier part 1 and separates out their exponents E0-E15, mantissas M0-M15, and sign bits S0-S15. The 16 product exponents E0-E15, together with the exponent E16 of the floating-point value in the scalar register, enter the cascaded exponent comparator 21 to obtain the maximum exponent Emax; the exponent-difference array 22, consisting of 17 parallel 8-bit subtractors, then computes each difference ΔEi = Emax − Ei. The shift-alignment unit 23 uses 17 parallel 32-bit shifters working simultaneously, each controlled by the corresponding ΔEi from the array 22; the shifter outputs feed the two's-complement conversion unit 24, which negates a shifted mantissa when its sign bit Si differs from Smax, the sign of the maximum-exponent operand. At the same time, the ΔEi shifted-out bits are sticky-compensated: when the mantissa requires complementing (Si ≠ Smax) or the shifted-out bits contain a 1, compensation is applied, i.e. Sticky_i = 1. The sticky-bit compensation unit 25 counts the number Comp of compensations required among Sticky0-Sticky16. The whole preprocessing part outputs a 512-bit vector mantissa, a 32-bit scalar mantissa, and the 5-bit sticky compensation count Comp.
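The following Python sketch models this preprocessing flow under the assumptions just stated (align to Emax, negate mantissas whose sign differs from Smax, sticky-compensate on negation or lost bits); it is a behavioral illustration with invented names, not the circuit.

```python
# Behavioral model of part 2 for one batch of 17 operands (16 products plus
# the scalar). Inputs are sign bits, biased exponents, and integer mantissas.
def preprocess(signs, exps, mants, width=32):
    e_max = max(exps)
    s_max = signs[exps.index(e_max)]          # sign of a maximum-exponent operand
    aligned, comp = [], 0
    for s, e, m in zip(signs, exps, mants):
        delta = e_max - e                     # shift distance from the subtractor array
        shifted_out = m & ((1 << delta) - 1)  # bits lost to the alignment shift
        m >>= delta
        if s != s_max:
            m = (-m) & ((1 << width) - 1)     # two's-complement conversion
        if s != s_max or shifted_out:
            comp += 1                         # Sticky_i = 1, tallied into Comp
        aligned.append(m)
    return e_max, s_max, aligned, comp
```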
When operating in fixed-point mode, the preprocessing part 2 passes the fixed-point result straight through on a separate path; fixed-point data undergoes no processing in this part.
As shown in Figure 4, to reduce the critical-path delay, the cascaded exponent comparison compares three values at a time: three 8-bit comparators work in parallel, each producing its flag bit, and the flags then select the maximum of the three values as the output.
As shown in Figure 5, E0-E16 enter the first-level comparator array, producing six "larger values"; these six values then pass through the second- and third-level comparator arrays in turn, finally yielding the maximum floating-point exponent Emax. The cascade is thus reduced from the original five levels (⌈log2 17⌉) to three levels (⌈log3 17⌉), removing the delay of two 8-bit comparator stages. Some control logic is added, but its delay is far smaller than an 8-bit comparator's, so the overall delay of the cascaded exponent comparator decreases.
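A small sketch of the three-way reduction, showing why 17 exponents need only three levels:

```python
# Reducing 17 values with 3-input maximum stages: ceil(log3(17)) = 3 levels,
# versus ceil(log2(17)) = 5 levels for a pairwise comparator tree.
def max3_tree(values):
    levels = 0
    while len(values) > 1:
        values = [max(values[i:i + 3]) for i in range(0, len(values), 3)]
        levels += 1
    return values[0], levels

e_max, levels = max3_tree(list(range(17)))
assert e_max == 16 and levels == 3            # 17 -> 6 -> 2 -> 1
```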
As shown in Figure 6, to keep the precision of the floating-point dot product compliant with the IEEE-754 standard, the floating-point mantissa is extended by 7 bits, i.e. the 24-bit mantissa is shifted left by 7 bits; at the same time, so that the floating-point mantissa can reuse the fixed-point mantissa compression path, a 0 is placed at the most significant bit, extending the 24-bit mantissa to 32 bits. During shift alignment the 32-bit mantissa is shifted right by ΔEi bits and the shifted-out ΔEi bits are preserved; when Si ≠ Smax the 32-bit mantissa must be complemented. The sticky-bit compensation unit takes Si ≠ Smax and the shifted-out ΔEi bits as control signals: when Si ≠ Smax, or when the OR-reduction of the shifted-out ΔEi bits is 1, complement compensation is applied, i.e. Sticky_i = 1; otherwise Sticky_i = 0.
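A sketch of this 24-to-32-bit mantissa packing follows, under the assumption that the reduced single-precision format keeps the usual implicit leading bit (the patent does not define the reduced format's handling of exponent zero):

```python
# Pack a 23-bit fraction into the 32-bit mantissa used by the compressor path:
# prepend the (assumed) implicit bit, shift left by 7 guard bits, keep MSB 0.
def extend_mantissa(frac23, exp):
    implicit = 0 if exp == 0 else 1           # assumption: zero for exponent 0
    m24 = (implicit << 23) | frac23
    return m24 << 7                           # MSB of the 32-bit result stays 0

assert extend_mantissa(0, 1) == 1 << 30       # mantissa 1.0 -> bit 30 set
assert extend_mantissa(0x7FFFFF, 0xFE) >> 31 == 0
```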
As shown in Figure 1, the reconfigurable compressor part 3 performs 8/16/32-bit fixed-point and 32-bit floating-point mantissa compression and comprises a Mask screening unit 31, a sign-extension unit 32, and a reconfigurable compressor network 33. The Mask screening unit 31 receives the output of the preprocessing part and examines the Mask register, which controls whether the vector register's contents participate in the dot product. When the M option is active, only the values whose corresponding Mask bits are 1 enter the compressor network; when M is inactive, the Mask register has no effect on the dot product. The Mask register is 64 bits wide, each bit indicating one byte of the vector register. The scalar register value and the sticky compensation Comp are unaffected by the Mask register. After Mask screening, the data enters the sign-extension unit 32, where each 8-bit byte of the 512-bit vector register is treated as an independent unit and extended by 8 sign bits: for an unsigned fixed-point dot product (U option active), 8 zeros are appended at the high end; for a signed fixed-point dot product (U inactive) or a floating-point dot product, 8 copies of the sign bit are appended. After sign extension, the vector mantissas, the scalar mantissa, and the Comp compensation unit enter the reconfigurable compressor network 33, which supports 8/16/32-bit signed/unsigned compression and reduces the 16/32/64 values of 32/16/8 bits, the 32-bit scalar, and the Comp compensation unit into the sum string S and carry string C.
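An illustrative model of the mask-and-sign-extend step (names are ours; each byte is widened from 8 to 16 bits here purely to show the 8-bit extension):

```python
# Gate each byte by its Mask bit, then extend it by 8 high bits: zeros in
# unsigned mode, copies of the sign bit in signed/floating-point mode.
def mask_and_extend(bytes_in, mask, unsigned, m_option=True):
    out = []
    for i, b in enumerate(bytes_in):          # each b is 0..255
        if m_option and not ((mask >> i) & 1):
            b = 0                             # screened-out byte does not participate
        if unsigned or b < 0x80:
            out.append(b)                     # high 8 bits padded with zeros
        else:
            out.append(0xFF00 | b)            # high 8 bits replicate the sign bit
    return out

assert mask_and_extend([0x90, 0x90], 0b01, unsigned=False) == [0xFF90, 0]
```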
As shown in Figure 7, the 512-bit vector mantissa, after sign extension, enters the first-level compressor array and is reduced by three layers of 32-bit compressors to two compressed results. In floating-point mode, the two outputs of the third-layer 32-bit compressor, the scalar register value, and the sticky compensation Comp enter a 4-2 compressor to produce the floating-point mantissa compression result; in 32-bit fixed-point mode, the Comp input of the 4-2 compressor is 0, producing the 32-bit fixed-point compression result. In the 16-bit fixed-point format, the outputs of the three 32-bit compressor layers pass in turn through a 16-bit compressor and then a 3-2 compressor, where they are compressed together with the scalar register value to produce the 16-bit fixed-point result. The 8-bit fixed-point format is handled similarly to the 16-bit fixed-point format.
As shown in Figure 8, the upper part of the figure is the compressor proper and the lower part the sign-extension portion. The 32-bit compressor consists of four 8-bit compressors connected in series through MUXes; according to the data format, each MUX feeds either the carry of the lower compressor or 0 into the next higher compressor. In 8-bit mode, all three MUXes select the 0 input; in 16-bit mode, the first and third MUXes select the lower carry and the second selects 0; in 32-bit mode, all three MUXes select the lower carry. The four 8-bit compressors of the sign-extension portion connect to the four 8-bit compressors of the compressor portion and receive the extended sign bits as input, guaranteeing arithmetic equivalence when compressing data of different granularities.
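The arithmetic behind all of these compressor layers is carry-save addition. The sketch below shows a behavioral 3-2 compressor and a reduction loop; it illustrates the S/C invariant only, not the 8-bit-sliced, MUX-reconfigurable structure of Figure 8.

```python
# Carry-save compression: a 3-2 compressor turns three addends into a "sum
# string" S and "carry string" C with S + C equal to the total (mod 2^width).
def compress_3_2(x, y, z, width=32):
    mask = (1 << width) - 1
    s = (x ^ y ^ z) & mask                            # bitwise sum, no carries
    c = (((x & y) | (y & z) | (x & z)) << 1) & mask   # carries, weighted x2
    return s, c

def compress_many(operands, width=32):
    ops = list(operands)
    while len(ops) > 2:                               # reduce three at a time
        ops.extend(compress_3_2(ops.pop(), ops.pop(), ops.pop(), width))
    return ops[0], (ops[1] if len(ops) > 1 else 0)

vals = [7, 9, 13, 21, 5]
s, c = compress_many(vals)
assert (s + c) & 0xFFFFFFFF == sum(vals) & 0xFFFFFFFF
```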
As shown in Figure 1, the floating-point exponent and mantissa post-processing / fixed-point operation part 4 comprises a mantissa addition unit 41, a leading-zero predictor PZD 42, a floating-point mantissa normalization-shift unit 43, a floating-point mantissa normalization-rounding unit 44, an exponent correction unit 45, a sign-bit correction unit 46, and a fixed-point result processing unit 47. The mantissa addition unit 41 adds the compressed S and C strings to obtain the mantissa sum. The leading-zero predictor PZD 42 pre-encodes the S and C strings entering unit 41 into a 0/1 string; a leading-zero detection circuit processes this string to locate the leading 1 and thereby control the shift distance of the normalized mantissa result. Because the pre-encoding may be off by one bit, the shifted result also passes through a compensation circuit that detects and corrects the error. The normalization-shift unit 43 shifts the result mantissa by the distance obtained from PZD 42 to yield the normalized floating-point mantissa. The normalization-rounding unit 44 rounds the normalized mantissa according to the Guard, Round, and Sticky bits. The exponent correction unit 45 and sign correction unit 46 adjust the exponent and sign bit according to the PZD output, the rounding outcome, and the mantissa sum, producing the final floating-point exponent and sign bit. The fixed-point result processing unit 47 processes the output of the mantissa addition unit 41 according to the fixed-point instruction options to obtain the fixed-point vector dot-product accumulate result.
The floating-point exponent and mantissa post-processing / fixed-point operand part 4 uses a fast compound adder to compute both S + C and −(S + C). In fixed-point formats, the fixed-point result processing 47 selects the S + C result and post-processes it into the vector fixed-point dot-product accumulate result. In floating-point format, the sign bit of the S + C result selects between S + C and −(S + C) and drives the sign-bit correction unit 46 to complete the floating-point sign correction: when the sign bit of S + C is negative, −(S + C) is selected and Smax is inverted to form the final result's sign bit; otherwise S + C is selected and the final sign equals Smax. The leading-zero predictor PZD 42 pre-encodes the S and C strings entering the mantissa addition unit 41 into a binary string; a leading-zero detection circuit processes this string to obtain the position of the leading 1 and the normalization shift distance Dnormal, which controls how many bits the normalization-shift unit 43 shifts left or right; because the pre-encoding may be off by one bit, the shifted result passes through a compensation circuit that detects and corrects the error. After the normalization shift, the normalization-rounding unit 44 evaluates the Guard/Round/Sticky bits on the shifted result and completes the rounding, at the same time determining whether a second rounding (SecondRound) is needed: a second rounding is required when rounding produces a carry out of the most significant mantissa bit, and it affects the floating-point result exponent. The exponent correction unit 45 completes the floating-point exponent adjustment based on the maximum exponent Emax, the normalization shift distance Dnormal, and the SecondRound flag.
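A minimal sketch of the compound-adder selection, assuming a two's-complement mantissa sum of a given width (the width parameter is illustrative):

```python
# Select the positive magnitude of the mantissa sum and fix the result sign:
# a negative S + C picks -(S + C) and inverts Smax; otherwise S + C and Smax.
def select_mantissa(s, c, s_max, width=32):
    mask = (1 << width) - 1
    total = (s + c) & mask
    if total >> (width - 1):                  # sign bit of S + C indicates negative
        return (-total) & mask, s_max ^ 1
    return total, s_max

assert select_mantissa(5, -12 & 0xFFFFFFFF, 0) == (7, 1)   # -7 -> magnitude 7, sign flipped
```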
As shown in Figure 9, the parallel exponent correction unit operates in parallel with the floating-point mantissa normalization-shift unit 43. Two 8-bit adders compute the values Emax + Dnormal and Emax + Dnormal + 1 respectively, and the second-rounding control logic (SecondRound) selects the final exponent result. When a second rounding is needed, the mantissa must be shifted left one more bit after the first rounding, and the floating-point result exponent is Emax + Dnormal + 1. With this parallel exponent adjustment, the exponent-adjust path, which previously contained two 8-bit adders in series, now costs only one adder's delay, shortening the critical path and improving the performance of the whole system.
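In behavioral terms the parallel correction is just a select between two precomputed sums, as the sketch below shows (names are ours):

```python
# Both candidate exponents are formed concurrently with the normalization
# shift; the SecondRound flag then selects one, so the exponent-adjust path
# costs a single 8-bit adder delay plus a MUX.
def correct_exponent(e_max, d_normal, second_round):
    candidate0 = e_max + d_normal             # first 8-bit adder
    candidate1 = e_max + d_normal + 1         # second 8-bit adder, in parallel
    return candidate1 if second_round else candidate0

assert correct_exponent(130, -3, False) == 127
assert correct_exponent(130, -3, True) == 128
```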
Based on the vector dot-product accumulate network shown in Figures 1 to 9, the present invention also provides a fixed/floating-point-reconfigurable, data-length-configurable summation method, which includes the following. 8/16/32-bit fixed-point data is reconfigurable, with 8-bit fixed-point data as the basic unit: two 8-bit fixed-point units plus the corresponding control logic are reconfigured into 16-bit fixed-point data, and four 8-bit units plus the corresponding control logic into 32-bit fixed-point data. Fixed and floating point are reconfigurable: after the floating-point mantissas are shift-aligned, the sign bit determines whether negation is performed; when the sign bit is 1, the mantissa is negated while the sign bit remains 1, and the sign bit and mantissa form a new 32-bit datum; when the sign bit is 0, the mantissa is left unchanged. After this sign handling the floating-point mantissa can reuse the fixed-point datapath. The data length is configurable via the Mask register: each Mask bit controls one bit field of the vector data, so configuring the Mask register's value configures the length of the data participating in the operation.
The specific embodiments above further describe the purpose, technical solution, and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Citations

Patent Citations (4)
- CN1633637A (Intel Corp., published 2005-06-29): Multiply-accumulate (MAC) unit for single-instruction/multiple-data (SIMD) instructions.
- US 2008/0071851 A1 (Ronen Zohar, published 2008-03-20): Instruction and logic for performing a dot-product operation.
- CN101840324A (Institute of Automation, Chinese Academy of Sciences, published 2010-09-22): 64-bit fixed- and floating-point multiplier unit supporting complex operation and subword parallelism.
- CN101847087A (Institute of Automation, Chinese Academy of Sciences, published 2010-09-29): Reconfigurable transverse summing network structure supporting fixed and floating points.

Non-Patent Citations (1)
- Gu Rongrong, "Design of a High-Performance Reconfigurable Multiply-Add Unit", Popular Science & Technology, vol. 2010, no. 02, 1 March 2010.