CN102799412A - CORDIC (coordinate rotation digital computer) accelerator based on parallel pipeline design - Google Patents

Info

Publication number
CN102799412A
Authority
CN
China
Prior art keywords
module
cordic
angle
data
register
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012102348091A
Other languages
Chinese (zh)
Inventor
毕卓
戴益君
韩冰
王镇
张莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority to CN2012102348091A
Publication of CN102799412A
Legal status: Pending

Landscapes

  • Complex Calculations (AREA)
  • Advance Control (AREA)

Abstract

The invention relates to a CORDIC accelerator based on a parallel pipeline design. It comprises an angle preprocessing module, a CORDIC core module, a back-end data processing module, and a controller module. The external data and control lines first deliver the relevant data and control signals to the controller module, which passes the angle to be evaluated to the angle preprocessing module. The preprocessed angle is then sent, together with the initial values supplied by the controller module, to the CORDIC core module, which uses a parallel 16-stage pipeline to quickly compute both the sine and the cosine of the preprocessed angle. These results, together with the phase control signal output by the controller module, go to the back-end data processing module, which determines the signs of the sine and cosine for the original phase. Finally, the sine and cosine values are returned to the controller module, which outputs the sine or the cosine according to the external control signal.

Description

CORDIC Accelerator Based on Parallel Pipeline Design

Technical Field

The present invention relates to a coordinate rotation digital computer (CORDIC) accelerator based on a parallel pipeline design, specifically an arithmetic unit that can quickly evaluate transcendental functions. It is mainly intended for use in aerospace, robotics, image and signal processing, filtering, and similar applications.

Background Art

High-precision trigonometric computation is widely used in practical engineering across many scientific and technical fields, including aerospace, image and signal processing, digital communications, video and graphics applications, measurement, numerical analysis, probability and statistics, and motion vector estimation. It is therefore important to study and implement fast, high-precision trigonometric computation in hardware. Hardware algorithms for mathematical functions can be grouped, by their underlying formulas and implementation, into the following categories: table lookup, polynomial approximation, combined table lookup and polynomial approximation, rational approximation, digit-by-digit methods, and the CORDIC algorithm.

CORDIC was first proposed by J. E. Volder in 1959. It is a recursive algorithm that, starting from suitably chosen initial values, computes these complex functions using only simple shifts, additions, and subtractions. To extend the set of elementary functions it can handle, J. S. Walther proposed the unified CORDIC algorithm in 1971. Existing CORDIC iterative units used in floating-point coprocessors support many kinds of functions, including arithmetic, trigonometric, and exponential operations, and to save resources they use the unified CORDIC algorithm for all of them. One example is the Chinese invention patent specification with publication number CN102073472A and a stated grant publication date of April 9, 2008, entitled "Trigonometric function CORDIC iteration operation coprocessor and operation processing method thereof". That patent provides a trigonometric CORDIC iterative coprocessor and processing method in which an input transformation circuit maps the input argument into the range accepted by the CORDIC algorithm, so that trigonometric functions are supported over all angles with no restriction on the input range. However, it requires a 54×54 multiplier ahead of the CORDIC operation, which greatly increases area and delay.

Compared with conventional CORDIC iterative methods and their circuit implementations, the advantage of the present invention in terms of real-time and area requirements is that it simultaneously satisfies four requirements: full-angle computation, low latency, small area, and high precision. It computes the required trigonometric values quickly and with minimal area while maintaining high precision.

Summary of the Invention

The object of the present invention is to address the large area and limited precision of designs that meet a given speed, by providing a CORDIC accelerator based on a parallel pipeline design that offers low cost, high throughput, small area, and high precision.

The technical solution of the present invention is a CORDIC accelerator based on a parallel pipeline design, comprising an angle preprocessing module, a CORDIC core module, a back-end data processing module, and a controller module. Its basic feature is that the front end of the CORDIC core module is connected to the angle preprocessing module and its back end to the back-end data processing module, while the controller module is connected to the angle preprocessing module, the CORDIC core module, and the back-end data processing module.

As shown in Figure 1, the external data and control lines deliver the relevant data and control signals to the controller module, which passes the angle to be evaluated to the angle preprocessing module. That module folds the full angle range into [0°, 90°], so the input is not restricted to the range (-99.8°, 99.8°). The preprocessed angle is then sent, together with the initial values supplied by the controller module, to the CORDIC core module, which uses a parallel pipeline to quickly compute both the sine and the cosine of the preprocessed angle. These results, together with the phase control signal output by the controller module, go to the back-end data processing module, which determines the signs of the sine and cosine for the original phase. Finally, the sine and cosine values are returned to the controller module, which outputs the sine or the cosine according to the external control signal. Each time a sine and cosine computation completes, the controller issues a response signal to the outside, which simplifies external control.

The CORDIC core module uses a parallel pipeline, so it computes the sine and cosine of the requested angle simultaneously and quickly, with high throughput. With 32 iterations, the precision of the resulting sine and cosine values reaches 10^-8. As for the shifters, each iteration needs to shift by only one bit; because of the pipeline structure, data passes directly from one stage to the next, so the shift can be realized simply by offsetting the wiring between stages, which reduces the area otherwise spent on shift registers. In addition, a conventional CORDIC computation looks up the arctangent angle value for each iteration in a table. Because the number of iterations is fixed in the present invention, the arctangent value of each stage is hard-wired to the corresponding high and low logic levels, so no value needs to be fetched from outside; this both speeds up the hardware and reduces the required ROM area. To keep the area as small as possible while computing function values at high speed, the CORDIC core module uses 32 iterations in a 16-stage pipeline.
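The 32-iteration rotation described above can be illustrated with a short software model. This is a minimal sketch in floating-point Python, not the patent's 32-bit hardware datapath: the names `cordic_sin_cos`, `ATAN_TABLE`, and `K` are illustrative, and the gain constant `K` is assumed to be the "initial value" supplied by the controller, which the source does not spell out.

```python
import math

N_ITER = 32  # iteration count stated in the text

# Arctangent constants atan(2^-i). They are fixed per iteration,
# which is why hardware can hard-wire them instead of using a ROM.
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(N_ITER)]

# Gain compensation K = prod(1 / sqrt(1 + 2^-2i)); assumed here to be
# the initial x value so that no final multiplication is needed.
K = 1.0
for i in range(N_ITER):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_sin_cos(theta):
    """Rotation-mode CORDIC; theta in radians, within about +/-99.8 degrees."""
    x, y, z = K, 0.0, theta
    for i in range(N_ITER):
        d = 1.0 if z >= 0 else -1.0          # rotate toward z = 0
        # Multiplying by 2^-i models a right shift by i bits.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ATAN_TABLE[i]
    return y, x                               # (sin, cos)


sin_30, cos_30 = cordic_sin_cos(math.radians(30.0))
```

Because the model uses double-precision floats, it isolates the error due to the finite number of iterations; a fixed-point implementation would add word-length rounding on top of that.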

The angle preprocessing module consists of four adders and one 4-to-1 selector whose inputs are connected to the outputs of the four adders. Depending on the input angle, a single stage of addition or subtraction, followed by output selection under the phase control signal, maps any input angle into the range [0°, 90°], so the operation of the whole CORDIC module is no longer constrained by the angle range.

The back-end data processing module consists of two inverters, two 4-to-1 selectors, and four registers. The inputs of the two 4-to-1 selectors are connected to the data produced by the CORDIC core module, and their outputs are connected to the sine and cosine registers and the error register; the phase register controls which input the 4-to-1 selectors pass to the output. The module assigns signs to the results output by the CORDIC core module according to the phase control signal.

The controller module consists mainly of registers and selectors. It coordinates the operation of the submodules within the CORDIC module and, according to external control signals, provides status and result values.

Compared with the prior art, the present invention has the following evident substantive features and notable advantages:

(1) Each module of the present invention is small and structurally simple, so the overall design is compact and easy to implement.

(2) The CORDIC module of the present invention is implemented as a parallel pipeline; in particular, the internal shift registers are realized as offset wiring, so the whole module has low latency and small area.

(3) In the CORDIC module of the present invention, the required arctangent values are hard-wired to fixed high and low logic levels and need not be fetched from a lookup table, so the whole module runs quickly and occupies little area.

(4) The CORDIC module of the present invention uses 32 iterations, giving a data precision of up to 10^-8. With a 16-stage pipeline, overall throughput is high, achieving fast computation.
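The link between 32 iterations and the stated 10^-8 precision can be checked with a short bound. This is the standard convergence estimate for rotation-mode CORDIC under exact arithmetic, not a derivation given in the source; the symbols z_n (residual angle after n iterations) and y_n (computed sine) are introduced here for illustration.

```latex
\begin{aligned}
|z_n| &\le \arctan\!\left(2^{-(n-1)}\right) < 2^{-(n-1)}, \\
|\sin\theta - y_n| &= |\sin\theta - \sin(\theta - z_n)| \le |z_n|, \\
n = 32:\quad |z_{32}| &< 2^{-31} \approx 4.66 \times 10^{-10} < 10^{-8}.
\end{aligned}
```

The same bound holds for the cosine. It covers only the error from stopping after a finite number of rotations; rounding in a 32-bit datapath adds to it, and the source does not give a separate error budget for that part.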

Brief Description of the Drawings

Figure 1 is the overall system control block diagram.

Figure 2 is a diagram of the plane coordinate rotation.

Figure 3 is a structural diagram of the angle preprocessing module.

Figure 4 is a structural diagram of the CORDIC core module.

Figure 5 is a structural diagram of one pipeline stage.

Figure 6 is a structural diagram of one iteration.

Figure 7 is a structural diagram of the back-end data processing module.

Figure 8 is a timing diagram of the entire CORDIC module.

Detailed Description

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings:

Embodiment 1:

As shown in Figure 1, the CORDIC accelerator based on a parallel pipeline design comprises an angle preprocessing module (1), a CORDIC core module (2), a back-end data processing module (3), and a controller module (4). The external data and control lines first deliver the relevant data and control signals to the controller module (4), which passes the angle to be evaluated to the angle preprocessing module (1). The preprocessed angle is then sent, together with the initial values supplied by the controller module (4), to the CORDIC core module (2), which uses a parallel pipeline to quickly compute both the sine and the cosine of the angle output by the angle preprocessing module (1). These results, together with the phase control signal output by the controller module (4), go to the back-end data processing module (3), which determines the signs of the sine and cosine for the original phase. Finally, the sine and cosine values are returned to the controller module (4), which outputs the sine or the cosine according to the external control signal.

Embodiment 2:

This embodiment is basically the same as Embodiment 1, with the following particular features:

The present invention uses the CORDIC algorithm in circular rotation mode. Figure 2 shows a plane coordinate rotation carried out in the circular system of the CORDIC algorithm. Figure 2.a shows the original point A(x1, y1) rotated through the angle θ to reach the target point B(x2, y2). Figure 2.b shows the point A'(x1', y1') reached after the first CORDIC rotation, and Figure 2.c shows the point A''(x1'', y1'') reached after the second CORDIC rotation. After many rotations, the rotated point An approaches the target point B.
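The rotation in Figure 2 can be written out explicitly. The equations below are the standard textbook form of circular rotation-mode CORDIC rather than formulas printed in the source; the symbols d_i (rotation direction), z_i (remaining angle), and K (gain compensation) are introduced here for illustration.

```latex
\begin{aligned}
x_2 &= x_1\cos\theta - y_1\sin\theta, \\
y_2 &= x_1\sin\theta + y_1\cos\theta, \\[4pt]
x_{i+1} &= x_i - d_i\, y_i\, 2^{-i}, \\
y_{i+1} &= y_i + d_i\, x_i\, 2^{-i}, \\
z_{i+1} &= z_i - d_i \arctan\!\left(2^{-i}\right),
\qquad d_i = \begin{cases} +1, & z_i \ge 0 \\ -1, & z_i < 0 \end{cases} \\[4pt]
K &= \prod_{i=0}^{n-1} \frac{1}{\sqrt{1 + 2^{-2i}}} \approx 0.607253 .
\end{aligned}
```

Starting from x_0 = K, y_0 = 0, z_0 = θ, each step is one of the small rotations drawn in Figures 2.b and 2.c, and after n steps x_n ≈ cos θ and y_n ≈ sin θ. Only shifts, additions, and subtractions appear in the recurrence.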

<1> CORDIC core module

Referring to Figure 4, this module is implemented as a parallel 16-stage pipeline, which increases computational throughput. Each stage consists of two 32-bit iterations and one 32-bit register (see Figure 5); using 32 iterations gives the whole computation a precision of 10^-8. Each iteration consists of a 32-bit adder-subtractor and a 32-bit shifter (see Figure 6). The design of the CORDIC core module centers on the 32-bit adder-subtractors, 32-bit registers, and shifters, of which the adder-subtractor is the key component. The adder-subtractor used in the present invention is based on a carry-lookahead adder, with lookahead carry within each group and serial (ripple) carry between groups; the carry propagation of each group can be treated as a separate instance unit. The registers are built from conventional transmission gates, with an added control logic circuit made of AND and OR gates that gives each register clear and enable functions. Because of the pipeline structure (see Figure 4), the shift in each iteration follows a fixed sequence, so data passes directly from one stage to the next, and this patent represents the corresponding shift registers simply as offset wiring. In addition, the arctangent angle value needed by each iteration of the CORDIC algorithm is replaced in this patent by wiring tied to fixed high and low levels, which increases the operating speed of the hardware.
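The adder organization described above (lookahead within a group, ripple between groups) can be modeled at the bit level. The sketch below is a Python behavioral model, not the patent's circuit; the 4-bit group width and the names `cla_group`, `add32`, and `sub32` are assumptions, since the source does not state the group size.

```python
MASK32 = 0xFFFFFFFF


def cla_group(a, b, carry_in, width=4):
    """Add two `width`-bit values using generate/propagate signals."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # propagate
    carries = [carry_in]
    for i in range(width):
        # c[i+1] = g[i] | (p[i] & c[i]); hardware expands this into
        # two-level logic so all carries in the group appear together.
        carries.append(g[i] | (p[i] & carries[i]))
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i                  # sum bit
    return total, carries[width]


def add32(a, b, carry_in=0):
    """32-bit add: lookahead inside each group, ripple between groups."""
    result, carry = 0, carry_in
    for k in range(0, 32, 4):
        s, carry = cla_group((a >> k) & 0xF, (b >> k) & 0xF, carry)
        result |= s << k
    return result, carry


def sub32(a, b):
    """Two's-complement subtract: a + ~b + 1, reusing the same adder."""
    return add32(a, ~b & MASK32, 1)
```

Reusing one adder for both operations mirrors how an adder-subtractor is commonly built: the direction bit inverts the second operand and feeds the carry input.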

<2> Angle preprocessing module

Referring to Figure 3, this module consists of a selector and adders. It compresses any input angle into the range [0°, 90°] and passes the compressed angle to the CORDIC core module, so that the computation is no longer limited by the angle range of the CORDIC core and the result is obtained quickly. The signal lines Phase_1, Phase_2, Phase_3, and Phase_4 in the angle preprocessing module represent the first, second, third, and fourth quadrants respectively. The two most significant bits of the input angle serve as the quadrant detection bits, carried on the Ph_c signal line. For example, when Ph_c indicates an angle in the second quadrant, the output selects the angle produced by the second-quadrant processing.
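The quadrant folding can be sketched as four parallel candidates and a 4-to-1 selection. The source does not give the angle encoding or the exact adder operations, so the following assumes a 32-bit binary angle (2^32 counts per full turn, which makes the top two bits the quadrant) and the folds θ, 180° − θ, θ − 180°, and 360° − θ; the function name `preprocess` is illustrative.

```python
FULL = 1 << 32      # assumed encoding: 2**32 counts per 360 degrees
HALF = 1 << 31      # 180 degrees
QUARTER = 1 << 30   # 90 degrees


def preprocess(angle_word):
    """Return (folded_word, ph_c); folded_word lies in [0, 2**30]."""
    angle_word &= FULL - 1
    ph_c = angle_word >> 30           # top two bits: 0..3 = quadrants 1..4
    # One candidate per quadrant, as four parallel adder outputs.
    candidates = (
        angle_word,                   # Phase_1: theta
        HALF - angle_word,            # Phase_2: 180 - theta
        angle_word - HALF,            # Phase_3: theta - 180
        FULL - angle_word,            # Phase_4: 360 - theta
    )
    return candidates[ph_c], ph_c     # 4-to-1 selection by Ph_c


# 135 degrees is 3/8 of a turn; it folds to 45 degrees in quadrant 2.
folded_135, quadrant_135 = preprocess(3 << 29)
```

Computing all four candidates and selecting afterwards matches the single-stage structure in the text: no comparison chain is needed because the quadrant is read directly from the top bits.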

<3> Back-end data processing module

Referring to Figure 7, this module consists of selectors, adders, and registers. Its function is to assign signs to the sine and cosine values obtained from the CORDIC core module according to the phase angle. The module is connected to the output data port of the CORDIC core module and to the controller's enable, clear, and phase-angle control signals. When the clear signal from the controller module is low and the enable signal is high, the back-end data processing module performs the sign determination and processes the results obtained from the CORDIC core.
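The sign assignment amounts to two 4-to-1 selections between a value and its negation. The sketch below assumes the original angle was folded as θ, 180° − θ, θ − 180°, or 360° − θ for quadrants 1 to 4, and that the quadrant is passed as an index 0 to 3; neither detail is stated in the source, and `restore_signs` is an illustrative name.

```python
import math


def restore_signs(sin_folded, cos_folded, quadrant):
    """Apply quadrant signs; quadrant is 0..3 for quadrants 1..4."""
    # Each tuple models one 4-to-1 selector fed by the value and its negation.
    sin_choices = (sin_folded, sin_folded, -sin_folded, -sin_folded)
    cos_choices = (cos_folded, -cos_folded, -cos_folded, cos_folded)
    return sin_choices[quadrant], cos_choices[quadrant]


# 135 degrees folds to 45 degrees in quadrant 2 (index 1).
folded = math.radians(45.0)
sin_135, cos_135 = restore_signs(math.sin(folded), math.cos(folded), 1)
```

Two negated copies (one for sine, one for cosine) are all the arithmetic required, which is consistent with the two inverters and two selectors listed for this module.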

<4> Controller module

This module consists mainly of selectors and registers and controls the timing of the entire CORDIC operation. For the CORDIC module as a whole (see Figure 8), the angle preprocessing module takes one clock cycle, the CORDIC core module takes sixteen clock cycles, and the back-end data processing module takes one clock cycle. When the reset signal is low, the entire CORDIC module is cleared. When reset is high and the start signal is high, data input and computation begin; after eighteen clock cycles the status signal ok goes high and the result is output. In addition, signal line n selects whether the sin value or the cos value is output, according to external needs. Because of the pipeline structure, throughput is high and computation is fast.
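The timing above (1 + 16 + 1 = 18 cycles) can be illustrated with a simple latency model. This sketch assumes a new angle can be issued on every clock cycle, which the text implies through its throughput claim but does not state explicitly; `run_pipeline` and the constant names are illustrative, and input values must not be `None` because `None` marks an empty pipeline slot.

```python
from collections import deque

PRE, CORE, POST = 1, 16, 1          # cycles per module, from the text
LATENCY = PRE + CORE + POST         # eighteen clock cycles in total


def run_pipeline(inputs, func):
    """Return (cycle, result) pairs; results appear LATENCY cycles after issue."""
    pipe = deque([None] * LATENCY)  # one slot per pipeline register
    outputs = []
    # Keep clocking after the last input so the pipeline drains.
    for cycle, item in enumerate(list(inputs) + [None] * LATENCY):
        pipe.appendleft(None if item is None else func(item))
        done = pipe.pop()           # value leaving the last stage
        if done is not None:        # corresponds to the ok signal going high
            outputs.append((cycle, done))
    return outputs


results = run_pipeline([10, 20, 30], lambda v: v * 2)
```

In this model the first result appears at cycle 18 and later results follow one per cycle, which is the sense in which a pipelined design trades a fixed latency for high throughput.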

Claims (7)

1. A CORDIC accelerator based on a parallel pipeline design, comprising an angle preprocessing module (1), a CORDIC core module (2), a back-end data processing module (3), and a controller module (4), characterized in that the front end of the CORDIC core module (2) is connected to the angle preprocessing module (1) and its back end to the back-end data processing module (3), and the controller module (4) is connected to the angle preprocessing module (1), the CORDIC core module (2), and the back-end data processing module (3); the external data and control lines deliver the relevant data and control signals to the controller module (4), which passes the angle to be evaluated to the angle preprocessing module (1); the preprocessed angle is then sent, together with the initial values supplied by the controller module (4), to the CORDIC core module (2), which uses a parallel pipeline structure to quickly compute both the sine and the cosine of the angle output by the angle preprocessing module (1); these results, together with the phase control signal output by the controller module (4), are sent to the back-end data processing module (3), which determines the signs of the sine and cosine for the original phase; finally, the sine and cosine values are returned to the controller module (4), which outputs the sine or the cosine according to the external control signal.

2. The CORDIC accelerator based on a parallel pipeline design according to claim 1, characterized in that the angle preprocessing module (1) consists of four adders and one 4-to-1 selector whose inputs are connected to the outputs of the four adders; depending on the input angle, a single stage of addition or subtraction, followed by output selection under the phase control signal, maps any input angle into the range [0°, 90°], so that the operation of the whole CORDIC module is no longer constrained by the angle range.

3. The CORDIC accelerator based on a parallel pipeline design according to claim 1, characterized in that the CORDIC core module (2) consists of adders and registers, the result of two stages of addition being sent to a register; it uses a parallel pipeline structure in which a 16-stage pipeline provides high speed and high throughput, and 32 iterations give the whole computation a precision of 10^-8.

4. The CORDIC accelerator based on a parallel pipeline design according to claim 1, characterized in that the CORDIC core module (2) consists of adders and registers; because the arctangent angle value required by each iteration of the CORDIC algorithm is fixed, wiring tied to fixed high and low levels is used in place of registers to store these values, increasing the operating speed of the hardware.

5. The CORDIC accelerator based on a parallel pipeline design according to claim 1, characterized in that the CORDIC core module (2) consists of adders and registers and uses a pipeline structure in which the shift in each iteration follows a fixed sequence, so that the results of the two-stage adders are fed directly into the register and the corresponding shift registers are represented simply by offset wiring, reducing area.

6. The CORDIC accelerator based on a parallel pipeline design according to claim 1, characterized in that the back-end data processing module (3) consists of two inverters, two 4-to-1 selectors, and four registers; the inputs of the two 4-to-1 selectors are connected to the data produced by the CORDIC core module (2), and their outputs are connected to the sine and cosine registers and the error register; the phase register controls which input the 4-to-1 selectors pass to the output; the results output by the CORDIC core module (2) are assigned signs according to the phase control signal.

7. The CORDIC accelerator based on a parallel pipeline design according to claim 1, characterized in that the controller module (4) consists of a finite state machine that regulates and controls the entire CORDIC operation so that it proceeds in an orderly manner.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012102348091A CN102799412A (en) 2012-07-09 2012-07-09 CORDIC (coordinate rotation digital computer) accelerator based on parallel pipeline design

Publications (1)

Publication Number Publication Date
CN102799412A true CN102799412A (en) 2012-11-28

Family

ID=47198529

Country Status (1)

Country Link
CN (1) CN102799412A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1989009447A1 (en) * 1988-03-29 1989-10-05 Yulun Wang Three-dimensional vector processor
US6480871B1 (en) * 1999-04-07 2002-11-12 Dhananjay S. Phatak Algorithm (Method) and VLSI architecture for fast evaluation of trigonometric functions
CN101183405A (en) * 2007-11-30 2008-05-21 西安交通大学 Realization method of small world algorithm hardware platform based on FPGA
CN102073472A (en) * 2011-01-05 2011-05-25 东莞市泰斗微电子科技有限公司 Trigonometric function CORDIC iteration operation coprocessor and operation processing method thereof

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ZHANG JUNTAO, ET AL.: "implementation of general cordic ip core based on FPGA", 《COMPUTER SCIENCE》 *
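戴益君 et al.: "CORDIC algorithm based on FPGA", 《Journal of Shanghai University (English Edition)》 *
毕卓 et al.: "Full-custom CORDIC arithmetic unit design", 《Computer Engineering & Science》 *
车力 et al.: "FPGA implementation of sine and cosine functions based on the CORDIC algorithm", 《Journal of Xi'an Polytechnic University》 *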

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104714773A (en) * 2015-03-04 2015-06-17 中国航天科技集团公司第九研究院第七七一研究所 Embedded rotation angle calculation IP soft core based on PLB bus and rotation angle calculation method
CN104714773B (en) * 2015-03-04 2018-04-20 中国航天科技集团公司第九研究院第七七一研究所 The soft core of the embedded IP based on PLB buses and anglec of rotation computational methods calculated for the anglec of rotation
WO2017185390A1 (en) * 2016-04-26 2017-11-02 北京中科寒武纪科技有限公司 Apparatus and method for executing transcendental function operation of vectors
CN107102841A (en) * 2017-04-06 2017-08-29 上海晟矽微电子股份有限公司 A kind of coordinate transform parallel calculating method and device
CN109614073A (en) * 2018-10-28 2019-04-12 西南电子技术研究所(中国电子科技集团公司第十研究所) Four-quadrant arctan function hardware circuit implementation
CN109614073B (en) * 2018-10-28 2023-08-08 西南电子技术研究所(中国电子科技集团公司第十研究所) Four-quadrant arctangent function hardware realization circuit
CN112650973A (en) * 2019-10-11 2021-04-13 珠海格力电器股份有限公司 Trigonometric function calculation device and electronic equipment
CN112650973B (en) * 2019-10-11 2022-05-20 珠海格力电器股份有限公司 Trigonometric function calculation device and electronic equipment
CN111666065A (en) * 2020-06-03 2020-09-15 合肥工业大学 Trigonometric function pipeline iteration solving method and device based on CORDIC
CN111666065B (en) * 2020-06-03 2022-10-18 合肥工业大学 Method and device for iterative solution of trigonometric function pipeline based on CORDIC
CN112989269A (en) * 2021-03-26 2021-06-18 上海西井信息科技有限公司 Accelerator and on-chip calculation module thereof
CN112989269B (en) * 2021-03-26 2023-07-25 上海西井科技股份有限公司 Accelerator and on-chip computing module for accelerator

Similar Documents

Publication Publication Date Title
CN102799412A (en) CORDIC (coordinate rotation digital computer) accelerator based on parallel pipeline design
CN110084361B (en) A computing device and method
Zhang et al. An improved sobel edge algorithm and FPGA implementation
CN105681628B (en) A kind of convolutional network arithmetic element and restructural convolutional neural networks processor and the method for realizing image denoising processing
CN106250103A (en) A kind of convolutional neural networks cyclic convolution calculates the system of data reusing
CN110688158A (en) Computing device and processing system of neural network
CN108733348A (en) The method for merging vector multiplier and carrying out operation using it
CN105468335A (en) Pipeline-level operation device, data processing method and network-on-chip chip
WO2022001550A1 (en) Address generation method, related device and storage medium
CN111381808B (en) Multiplier, data processing method, chip and electronic device
CN110515589A (en) Multiplier, data processing method, chip and electronic device
TWI774093B (en) Converter, chip, electronic equipment and method for converting data types
Li et al. Study of CORDIC algorithm based on FPGA
CN102360281B (en) Multifunctional fixed-point media access control (MAC) operation device for microprocessor
CN102663666A (en) Two-dimensional image resampling algorithm accelerator based on field-programmable gate array (FPGA)
CN101866278B (en) A 64-bit integer multiplier with asynchronous iteration and its calculation method
US20180032336A1 (en) Processor and method for executing instructions on processor
CN102129419A (en) Fast Fourier transform-based processor
CN102789446A (en) DDS (Direct Digital Synthesizer) signal spurious suppression method and system on basis of CORDIC (Coordinated Rotation Digital Computer) algorithm
CN103713878B (en) A kind of method that sine and cosine cordic algorithm applying complement method realizes at FPGA
CN111260042B (en) Data selector, data processing method, chip and electronic device
CN103067718B (en) Be applicable to the one-dimensional discrete cosine inverse transform module circuit of digital video decoding
CN102298568A (en) Method and device for switching configuration information of dynamic reconfigurable array
WO2022001500A1 (en) Computing apparatus, integrated circuit chip, board card, electronic device, and computing method
CN209895329U (en) multiplier

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20121128