CN101477456B - Self-correlated arithmetic unit and processor - Google Patents

Info

Publication number
CN101477456B
Authority
CN
China
Prior art keywords
unit
register
data
autocorrelation
shift register
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2009101050581A
Other languages
Chinese (zh)
Other versions
CN101477456A (en)
Inventor
焦玉中
王新安
倪学文
刘雪娇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Hengxing Strategy Investment Ltd
Original Assignee
Peking University Shenzhen Graduate School
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Shenzhen Graduate School
Priority to CN2009101050581A
Publication of CN101477456A
Application granted
Publication of CN101477456B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Complex Calculations (AREA)

Abstract

The invention discloses an autocorrelation operation unit and a processor. The autocorrelation operation unit comprises a correlation-number processing unit, a multiplier, an adder, a second shift register, and a register. By executing an autocorrelation operation instruction, the processor configures the function of the autocorrelation operation unit, reads the data elements of the source operand in order and performs the autocorrelation operation on them, and outputs the data elements of the destination operand in sequence. The autocorrelation operation instruction therefore needs to be fetched, decoded, and executed only once, which simplifies the autocorrelation operation.

Description

Autocorrelation operation unit and processor

【Technical Field】

The invention relates to the field of integrated circuit design, and in particular to a processor and an autocorrelation operation unit.

【Background】

In the communications field, various operations, such as autocorrelation, often need to be performed on a sequence of data elements. Fig. 1 is a schematic diagram of the principle of the autocorrelation operation. The data stream 100 advances by one position along the flow direction every sampling period; the arrow indicates the flow direction. The data stream 100 comprises a number of data elements 101. Elements 102 and 103 are, respectively, the element that has most recently entered window A 106 and the element about to leave it. Elements 104 and 105 are, respectively, the element that has most recently entered window B 107 and the element about to leave it. The autocorrelation result is the sum of the products of the data elements in window A with the conjugates of the corresponding data elements in window B. When data elements 102 and 104 are the same element, the operation becomes a special case of autocorrelation: computing the average power of the data elements. To perform autocorrelation, an existing processor must fetch two source operands each time, store the product in memory, and then add it to or subtract it from the previous autocorrelation result. The operation is therefore cumbersome and leaves room for improvement.
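As an illustration of the definition above, the following minimal Python sketch computes one autocorrelation result directly, by summing the products over both windows. The function name, the parameter names, and the indexing convention (index `n` is the newest element in window A, and `distance` is the offset between the newest elements of the two windows) are assumptions made for this example; they do not appear in the patent.

```python
def windowed_autocorrelation(x, n, window_len, distance):
    """Directly sum x[n-k] * conj(x[n-distance-k]) for k in 0..window_len-1.

    Assumed convention (not from the patent): x[n] is the newest element in
    window A and x[n - distance] is the newest element in window B.
    """
    oldest_index = n - distance - (window_len - 1)
    if oldest_index < 0 or n >= len(x):
        raise ValueError("both windows must lie inside the stream")
    total = 0
    for k in range(window_len):
        a = x[n - k]                    # element in window A
        b = x[n - distance - k]         # corresponding element in window B
        total += a * complex(b).conjugate()
    return total


stream = [1 + 1j, 2, 3j, 4 - 1j, 5]
print(windowed_autocorrelation(stream, n=2, window_len=2, distance=1))
```

With `distance=0` the two windows coincide and the sum becomes the total power over the window, which matches the special case mentioned in the text (dividing by the window length would give the average).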

【Summary of the Invention】

The main technical problem to be solved by the invention is to provide a processor and an autocorrelation operation unit that simplify the autocorrelation operation.

To solve the above technical problem, the invention provides an autocorrelation operation unit comprising a correlation-number processing unit, a multiplier, an adder, a second shift register, and a register. The correlation-number processing unit comprises a first shift register whose input receives the data elements of the source operand and whose output is connected to the multiplier. The inputs of the multiplier receive the data elements of the source operand and the result output by the correlation-number processing unit. The output of the multiplier is connected to both the adder and the second shift register. The inputs of the adder are additionally connected to the output of the second shift register and the output of the register. The output of the adder is connected to the input of the register and also outputs the data elements of the destination operand.

The correlation-number processing unit further comprises a conjugation unit. Either the input of the conjugation unit is connected to the output of the first shift register and its output is connected to the multiplier; or the input of the conjugation unit receives the data elements of the source operand, its output is connected to the input of the first shift register, and the output of the first shift register is connected to the multiplier.
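The two placements of the conjugation unit described above feed the same values to the multiplier, because conjugation acts on each element independently of the delay. The Python sketch below illustrates this under stated assumptions: the shift register is modelled as a `collections.deque` pre-filled with zeros, `distance` (at least 1) is the register length, and both function names are hypothetical.

```python
from collections import deque


def conjugate_after_delay(stream, distance):
    """Shift register first, conjugation unit on its output."""
    reg = deque([0] * distance, maxlen=distance)
    out = []
    for element in stream:
        out.append(complex(reg[-1]).conjugate())  # conjugate the delayed value
        reg.appendleft(element)                   # shift in the raw element
    return out


def conjugate_before_delay(stream, distance):
    """Conjugation unit first, shift register on its output."""
    reg = deque([0] * distance, maxlen=distance)
    out = []
    for element in stream:
        out.append(complex(reg[-1]))                  # value is already conjugated
        reg.appendleft(complex(element).conjugate())  # shift in the conjugate
    return out


stream = [1 + 2j, 3 - 1j, 4j]
print(conjugate_after_delay(stream, 1) == conjugate_before_delay(stream, 1))
```

This only models the values reaching the multiplier; it says nothing about hardware cost, which may differ between the two arrangements.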

According to another aspect of the invention, a processor is provided, comprising an algorithm data control unit, a configuration register, and at least one arithmetic logic unit. The arithmetic logic unit includes at least the autocorrelation operation unit described above for performing the autocorrelation operation. The algorithm data control unit is connected to the configuration register, and the configuration register is connected to the autocorrelation operation unit. The algorithm data control unit executes the autocorrelation operation instruction and sends first configuration information to the configuration register; the autocorrelation operation unit configures its own function according to the first configuration information.

The beneficial effects of the invention are as follows. By executing the autocorrelation operation instruction, the processor configures the function of the autocorrelation operation unit and reads the data elements of the source operand in order. For each newly read data element, the autocorrelation operation unit performs a complex multiplication of that element with the conjugate of a previously read data element of the source operand, obtaining a complex or real product. It then subtracts the product that corresponds to a previously read data element, and adds the autocorrelation result that corresponds to the data element immediately preceding the newly read one. The result is the data element of the destination operand, that is, the autocorrelation result, that corresponds to the newly read data element. The data elements of the destination operand are output in sequence, so the autocorrelation operation instruction needs to be fetched, decoded, and executed only once. This simplifies the autocorrelation operation, allows a smaller instruction memory, and reduces the power consumed by instruction decoding and configuration.

【Brief Description of the Drawings】

Fig. 1 is a schematic diagram of the principle of the autocorrelation operation;

Fig. 2A is a structural diagram of one embodiment of the invention;

Fig. 2B is a structural diagram of another embodiment of the invention;

Fig. 3A is a structural diagram of one embodiment of the autocorrelation operation unit of the invention;

Fig. 3B is a structural diagram of another embodiment of the autocorrelation operation unit of the invention;

Fig. 3C is a structural diagram of yet another embodiment of the autocorrelation operation unit of the invention.

【Detailed Description】

The invention is described in further detail below through specific embodiments with reference to the accompanying drawings.

The following describes embodiments of a technique for performing autocorrelation or average-power operations in a processing device, computer, or software program. Numerous specific details, such as processor type, microarchitecture, and start-up mechanism, are set forth to provide a thorough understanding of the invention. Those skilled in the art will nevertheless understand that the invention can be practiced without these specific details. Although the embodiments below are described with reference to a digital signal processor, other embodiments apply to other types of integrated circuits and logic devices.

Processing data in a way that matches the algorithmic characteristics of the signal processing involved can improve the overall performance of a processor. In broadband communications, for example, the baseband processing part of a system generally has a pipelined or stream-processing structure. Each processing module receives several data streams of homogeneous data elements and, through a number of repeated operations, produces several new data streams of homogeneous data elements. For such highly regular data, a coarser-grained execution unit helps process the data quickly. For example, one repetition of the operation above can be handled by a single execution unit in the processor. Because this execution unit performs the same processing every time, and the location and addressing method of the source operand are relatively fixed, the repeated operations do not require the instruction to be fetched and decoded again. The execution unit is configured by executing an instruction operation once (using one or more instructions) before or at the start of data stream processing. The execution unit is preferably data-driven, so that it performs one operation for each data element it receives. The instruction can specify the number of data elements that must be received to complete a task, that is, the number of data elements in the source operand. Once all data elements of the source operand have been processed and the results output, the execution unit stops immediately and waits for the next configuration and task.
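The configure-once, data-driven behaviour described above can be pictured with a small software model. This is only a sketch under assumptions: the class `DataDrivenUnit`, its methods, and the use of a Python callable to stand in for the configured function are all hypothetical and are not part of the patent.

```python
class DataDrivenUnit:
    """Toy model of an execution unit that is configured once per task."""

    def __init__(self):
        self.remaining = 0      # elements still expected for the current task
        self.operation = None   # function applied to each received element

    def configure(self, element_count, operation):
        # One configuration step replaces per-element instruction fetch/decode.
        self.remaining = element_count
        self.operation = operation

    @property
    def busy(self):
        return self.remaining > 0

    def push(self, value):
        # Data-driven: each received element triggers exactly one operation.
        if not self.busy:
            raise RuntimeError("unit is idle; configure it before sending data")
        self.remaining -= 1
        return self.operation(value)


unit = DataDrivenUnit()
unit.configure(element_count=3, operation=lambda v: v * 2)
print([unit.push(v) for v in (1, 2, 3)], unit.busy)
```

After the configured number of elements has been consumed, the model goes idle and rejects further data until it is configured again, mirroring the "stop and wait for the next configuration" behaviour in the text.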

Embodiments of the invention include a unit for performing the autocorrelation operation. Fig. 1 is a schematic diagram of the principle of the autocorrelation operation. The autocorrelation result is the sum of the complex products of the data elements in window A with the conjugates of the corresponding data elements in window B. Because the result of the previous operation is always known, the current result is the previous result, plus the complex product of data element 102 (newest in window A) with the conjugate of data element 104 (newest in window B), minus the complex product of data element 103 (about to leave window A) with the conjugate of data element 105 (about to leave window B). The data stream enters window A first. Until window B receives the first data element of the stream, the elements in window B can be treated as zero, so the autocorrelation value is zero. Once window B receives the first data element of the stream, the autocorrelation value becomes meaningful. When the first data element of the stream becomes element 105, the autocorrelation results start to be output. Accordingly: while data elements have entered window A but not window B, no autocorrelation operation is needed; from the moment the first data element enters window B, the autocorrelation operation is performed, but the result is retained and not output; when the first data element is about to leave window B, the autocorrelation operation is performed and the result is both retained and output. From then on, one autocorrelation result is output each time the stream advances by one data element, until the last data element of the stream enters window A and the last autocorrelation result is output.
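The incremental update and the output timing described above can be sketched in Python as follows. The indexing is an assumption made for this example: `x[n]` is the newest element in window A, `x[n - distance]` is the newest element in window B, elements before the start of the stream count as zero, and the first result is emitted once the first stream element is the oldest element in window B, i.e. at `n = distance + window_len - 1`. The function name is hypothetical.

```python
def sliding_autocorrelation(x, window_len, distance):
    """Incremental form: R[n] = R[n-1] + p[n] - p[n - window_len],
    where p[n] = x[n] * conj(x[n - distance]) and x[i] = 0 for i < 0."""

    def at(i):
        return x[i] if i >= 0 else 0  # positions before the stream hold zero

    outputs = []
    result = 0
    first_output = distance + window_len - 1  # assumed start of valid output
    for n in range(len(x)):
        entering = at(n) * complex(at(n - distance)).conjugate()
        leaving = (at(n - window_len)
                   * complex(at(n - window_len - distance)).conjugate())
        result = result + entering - leaving   # result is always retained
        if n >= first_output:
            outputs.append(result)             # and output once windows are full
    return outputs


stream = [1 + 1j, 2, 3j, 4 - 1j, 5]
print(sliding_autocorrelation(stream, window_len=2, distance=1))
```

Under these assumptions a stream of N elements yields N - distance - window_len + 1 results, and each update costs one multiplication, one addition, and one subtraction regardless of the window length.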

The source operand of the continuous autocorrelation operation is the data stream described above, which comprises a number of data elements. Embodiments of the invention perform the autocorrelation operation on a source operand that has a known number of data elements.

Referring to Fig. 2A, in one embodiment the processor 200 that performs the autocorrelation operation comprises an algorithm data control unit (ADU) 203, a configuration register 205, and at least one arithmetic logic unit. The arithmetic logic unit includes at least one autocorrelation operation unit 201 for performing the autocorrelation operation. The ADU 203 is connected to the configuration register 205, and the configuration register 205 is connected to the autocorrelation operation unit 201. The autocorrelation operation unit 201 is also connected to the source operand input source and the destination operand output source. The ADU 203 executes a configuration instruction, in this embodiment the autocorrelation operation instruction, and sends first configuration information to the configuration register 205. The autocorrelation operation unit 201 configures its own function according to the first configuration information.

In this embodiment, the ADU 203 comprises a storage unit 204 for storing instructions or data and a decoding unit 214 for decoding instructions; in other embodiments it may comprise other units. The configuration instruction executed by the ADU comprises an opcode, configuration information, and a configuration destination. The opcode is the command code that specifies the operation the instruction performs; the configuration information is the object of the instruction; and the configuration destination specifies the configuration register into which the configuration information is written. For the autocorrelation operation instruction, the configuration information comprises the number of data elements in the source operand, the window length of the autocorrelation operation, and the distance between the two windows; in other embodiments it may include further information. The ADU 203 reads the autocorrelation operation instruction from the storage unit 204, decodes it with the decoding unit 214, and writes the resulting first configuration information into the configuration register 205. The autocorrelation operation unit 201 configures its function according to the first configuration information, which sets the specific parameters of the autocorrelation operation: the number of data elements in the source operand, the window length, and the distance between the two windows. The data elements of the source operand may come directly from a port of the digital signal processor, or from a register or data memory inside it; that is, the source operand input source may be a port, an internal register, or the data memory of the processor. Similarly, the data elements of the destination operand may be stored to a port, an internal register, or the data memory of the digital signal processor; that is, the destination operand output source may likewise be a port, an internal register, or the data memory of the processor.
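To make the three parts of the configuration instruction concrete, the sketch below models them as Python dataclasses. The patent does not specify an encoding, so everything here is an assumption for illustration: the class names, the field names, the opcode string `"AUTOCORR"`, the register name `"cfg0"`, and the use of a dictionary to stand in for the configuration registers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutocorrConfig:
    """Configuration information named in the text."""
    element_count: int   # number of data elements in the source operand
    window_len: int      # window length of the autocorrelation operation
    distance: int        # distance between the two windows


@dataclass(frozen=True)
class ConfigInstruction:
    opcode: str              # hypothetical mnemonic, not from the patent
    config: AutocorrConfig   # configuration information
    destination: str         # which configuration register receives it


def execute_config_instruction(instruction, config_registers):
    """Model of decode-and-write: store the configuration at the destination."""
    if instruction.opcode != "AUTOCORR":
        raise ValueError(f"unsupported opcode: {instruction.opcode}")
    config_registers[instruction.destination] = instruction.config
    return config_registers


registers = {}
instr = ConfigInstruction("AUTOCORR", AutocorrConfig(64, 16, 16), "cfg0")
execute_config_instruction(instr, registers)
print(registers["cfg0"])
```

The point of the model is that a single decoded instruction carries every parameter the unit needs for the whole stream, so nothing has to be decoded again per element.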

The processor executes a single-instruction, multiple-data-in, multiple-data-out autocorrelation instruction on the source operand. It configures the function of the autocorrelation operation unit, reads and processes the data elements of the source operand in order, and outputs the data elements of the destination operand in sequence. The autocorrelation operation instruction therefore needs to be fetched, decoded, and executed only once, which simplifies the autocorrelation operation.

Referring to Fig. 2B, in another embodiment the processor 200 differs from the embodiment above mainly in that it also comprises an interconnect logic unit 206, to which the configuration register 205 is also connected. The ADU 203 also executes other configuration instructions and sends second configuration information to the configuration register 205. The interconnect logic unit 206 configures the input path of the source operand data and the output path of the destination operand data according to the second configuration information.

In this embodiment, the processor 200 comprises: a plurality of arithmetic logic units (ALUs) 202 that perform digital signal processing; an algorithm data control unit (ADU) 203 that fetches and decodes instructions to produce configuration information for functions and connections; a configuration register (Config) 205 that stores the configuration and control information obtained by instruction decoding; an interconnect logic unit 206 that configures the connections between ALUs and ports, among the ALUs, and among the ports; and ports 208 that connect to the bus 209 between external processor units. The functions an ALU can perform include, but are not limited to, arithmetic and logic operations such as addition, subtraction, multiplication, multiply-accumulate, AND, OR, XOR, arithmetic/logical left shift, arithmetic/logical right shift, comparison, and transfer. The ALU 201 that performs the autocorrelation operation, for example, can carry out more complex tasks: once configured, it performs the autocorrelation of a data stream comprising multiple data elements. The interconnect logic unit 206 includes registers (Reg) 207 for exchanging data between ALUs. Control buses 211 and 212 configure, respectively, the ALU functions and the connections between the ALUs and the ports. Data bus 210 connects the ALUs to the interconnect logic unit 206, and data bus 213 connects the ports 208 to the interconnect logic unit 206.

In this embodiment, the autocorrelation operation proceeds as follows. The ADU 203 reads and decodes the autocorrelation instruction in the storage unit (MEM) 204 and writes the resulting function and connection configuration information into the configuration register 205. The autocorrelation ALU 201 and the interconnect logic unit 206 then configure the function and the connections according to the information in the configuration register 205. The function configuration sets the specific parameters of the autocorrelation operation, such as the number of data elements in the source operand, the window length, and the distance between the two windows. The connection configuration sets the locations of the source and destination operands. The source operand input source may be at least one of the port 208, the internal register 207, and the data memory of the processor. The destination operand output source may likewise be at least one of the port 208, the internal register 207, and the data memory, and may be a different port or register from the source operand input source. The interconnect logic unit 206 selects the input and output paths according to the second configuration information. Once configuration is complete, for example, the autocorrelation ALU 201 can read data from the port 208 or the register (Reg) 207, begin continuous autocorrelation, and output the results to the port 208 or the register (Reg) 207.

In the embodiments above, the processor operates in a data-driven mode: when data is available it is processed and the result is stored in the corresponding port or register; when no data is available the processor waits. After all data elements of the source operand have been processed and all data elements of the destination operand have been output, processing stops and the processor waits for the next configuration.

The autocorrelation operation unit of the embodiments above comprises a correlation-number processing unit, a multiplier, an adder, a second shift register, and a register. The input of the correlation-number processing unit receives the data elements of the source operand. The inputs of the multiplier receive the data elements of the source operand and the result output by the correlation-number processing unit. The output of the multiplier is connected to both the adder and the second shift register. The inputs of the adder are additionally connected to the output of the second shift register and the output of the register. The output of the adder is connected to the input of the register and also outputs the data elements of the destination operand.

The correlation-number processing unit determines the correlation number that is multiplied with the incoming data element of the source operand. The incoming data element is the first data element in window A, that is, the one that has most recently entered it. When the data elements of the source operand are real, the correlation number is the first data element in window B. When they are complex, the correlation number is the conjugate of the first data element in window B.

The following description takes complex data elements in the source operand as an example.

FIG. 3A is a logic block diagram of one implementation of the autocorrelation operation unit of the above embodiment. The data elements of the source operand are complex numbers. The correlation number processing unit comprises a first shift register (the source operand data element shift register 300) and a conjugate unit 306. The input of shift register 300 receives the data elements of the source operand, its output is connected to the input of conjugate unit 306, and the output of conjugate unit 306 is connected to the multiplier. In this embodiment the multiplier is a complex multiplier 307; the adder is a complex adder 308, which specifically comprises two complex adders and one complex subtractor; the second shift register is the complex multiplication result shift register 301; and the register is the autocorrelation result register 309.

Each incoming data element of the source operand is sent both to complex multiplier 307 and to the first (leftmost) register position 303 of shift register 300. Shift register 300 stores the source operand or some of its data elements. Each time a new data element of the source operand is received, the contents of shift register 300 move one position in a given direction, one previously stored data element is discarded, and the newly received data element is stored in the register position vacated by the shift. The last register position 302 of shift register 300 corresponds to data element 104, the element that most recently entered window B in FIG. 1. The effective length of shift register 300 is therefore the distance between the first data elements entering the two windows in FIG. 1, that is, the distance between window A and window B. The data element that is about to be discarded by the shift (the data element in register position 302) passes through conjugate unit 306, and the result is complex-multiplied by the most recently received data element of the source operand.

The output of complex multiplier 307 is sent both to complex adder 308 and to the first (leftmost) register position 304 of the complex multiplication result shift register 301. Shift register 301 stores the complex products corresponding to some of the data elements of the source operand. Each time a new complex product is generated, the contents of shift register 301 move one position in a given direction, the complex product corresponding to one previously stored data element is discarded, and the new complex product is stored in the register position vacated by the shift. The length of shift register 301 is the window length of the autocorrelation operation, that is, the length of window A or window B, which are equal. Register position 305 is the last position; the value stored there corresponds to the product of data element 103 and the conjugate of data element 105 in the data stream of FIG. 1.

Complex adder 308 receives three inputs: the previous autocorrelation value, the current output of the complex multiplier, and the value stored in register position 305. The value from register position 305 is subtracted by the subtractor inside complex adder 308. The output of the complex adder is both stored in the correlation value register 309 and output as a data element of the destination operand. Correlation value register 309 thus holds the autocorrelation result corresponding to the data element immediately preceding the most recently read data element of the source operand.
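The FIG. 3A datapath can be modelled in software as a rough behavioural sketch. The function name `sliding_autocorrelation`, its parameters, and the choice to start both shift registers and the result register at zero are assumptions made for illustration; the patent text does not specify initial register contents or any software interface.

```python
from collections import deque


def sliding_autocorrelation(samples, distance, window):
    """Behavioural model of FIG. 3A (illustrative only).

    distance: effective length of shift register 300 (gap between windows).
    window:   length of shift register 301 (autocorrelation window length).
    Returns one running autocorrelation value per input sample.
    """
    if distance < 1 or window < 1:
        raise ValueError("distance and window must be at least 1")

    # Assumption: all storage starts at zero.
    delay = deque([0j] * distance, maxlen=distance)   # shift register 300
    products = deque([0j] * window, maxlen=window)    # shift register 301
    acc = 0j                                          # result register 309
    results = []

    for x in samples:
        oldest_sample = delay[-1]                 # register position 302
        product = x * oldest_sample.conjugate()   # units 306 and 307
        oldest_product = products[-1]             # register position 305
        acc = acc + product - oldest_product      # adder/subtractor 308

        # New values enter on the left (positions 303 and 304); because
        # the deques are bounded, the rightmost entry is discarded.
        delay.appendleft(x)
        products.appendleft(product)
        results.append(acc)

    return results
```

Each loop iteration corresponds to one received data element: one multiplication, one addition and one subtraction, regardless of the window length, which is the point of keeping the products in a second shift register.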

Each time the two shift registers receive a data item, they automatically shift by one register position in the specified direction. In this embodiment both registers shift right by one position per received data item; in other embodiments they may instead be defined to shift left by one position.

The autocorrelation operation of this embodiment proceeds as follows. The most recently read data element of the source operand is complex-multiplied by the conjugate of a previously read data element of the source operand, giving a complex or real product. The product corresponding to a previously read data element is subtracted from this product, and the autocorrelation result corresponding to the data element immediately preceding the most recently read one is added. The result is the autocorrelation value of the destination operand that corresponds to the most recently read data element of the source operand.
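Written as a recurrence, the process above takes the following form. The symbols are notation introduced here for illustration and do not appear in the patent: x[n] is the n-th data element, D the distance between the two windows (the effective length of the first shift register), W the window length (the length of the second shift register), p[n] the product, and R[n] the autocorrelation output. Terms with a negative index are taken as zero, which assumes the registers start cleared.

```latex
\begin{aligned}
p[n] &= x[n]\,\overline{x[n-D]},\\
R[n] &= R[n-1] + p[n] - p[n-W]
      = \sum_{k=n-W+1}^{n} x[k]\,\overline{x[k-D]}.
\end{aligned}
```

The closed-form sum on the right follows by unrolling the recurrence: every product enters the sum once when it is generated and leaves it W steps later.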

FIG. 3B is a logic block diagram of another implementation of the autocorrelation operation unit of the above embodiment. The data elements of the source operand are still complex numbers. It differs from FIG. 3A in the position of conjugate unit 306 and in the replacement of the source operand data element shift register 300. The first shift register is now the source operand data element conjugate value shift register 310: each input data element is conjugated first and the conjugate is then stored in shift register 310. The first and last register positions of shift register 310 are 311 and 312 respectively.
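Because conjugation is applied element by element, conjugating before the delay (FIG. 3B) or after it (FIG. 3A) should feed the same values to the multiplier. The sketch below illustrates this; both function names and the zero-initialised register are assumptions for illustration, not part of the patent.

```python
from collections import deque


def products_conj_after(samples, distance):
    """FIG. 3A ordering: store raw samples, conjugate at the register output."""
    delay = deque([0j] * distance, maxlen=distance)   # shift register 300
    out = []
    for x in samples:
        out.append(x * delay[-1].conjugate())         # units 306 then 307
        delay.appendleft(x)
    return out


def products_conj_before(samples, distance):
    """FIG. 3B ordering: conjugate first, store conjugates in the register."""
    delay = deque([0j] * distance, maxlen=distance)   # shift register 310
    out = []
    for x in samples:
        out.append(x * delay[-1])                     # multiplier 307
        delay.appendleft(x.conjugate())               # unit 306 at the input
    return out
```

The rest of the datapath (second shift register, adder, result register) is unchanged between the two figures, so only the multiplier inputs are modelled here.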

When the data elements of the source operand are real numbers, the correlation number processing unit does not need to compute a conjugate. It therefore comprises only the first shift register, whose length is the same as that of the first shift register in FIG. 3A and FIG. 3B. The input of the first shift register receives the data elements of the source operand and its output is connected to the multiplier, so that each delayed data element is multiplied by the data element currently being input. Correspondingly, the multiplier and the adder are a real multiplier and a real adder.

The effective length of each shift register in the above embodiments is determined by the function configuration information obtained from instruction decoding. In other words, the autocorrelation operation unit contains two relatively long shift registers, and when an autocorrelation operation is actually performed, the effective lengths of the source operand data element shift register 300 (or the source operand data element conjugate value shift register 310) and of the complex multiplication result shift register 301 are each set according to the function configuration information. As described above, the effective length of shift register 300 (or 310) is the distance between the two windows of the autocorrelation operation, that is, the distance between the first data elements entering the two windows in FIG. 1, and the length of shift register 301 is the window length of the autocorrelation operation.
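One way to picture a long physical register with a configurable effective length is to move the output tap, as in the sketch below. The class `ConfigurableShiftRegister`, its methods, the clearing of the cells on reconfiguration, and the pass-through behaviour for an effective length of zero are all hypothetical modelling choices; the patent only states that the effective length comes from the decoded configuration information.

```python
class ConfigurableShiftRegister:
    """Fixed-capacity shift register with a selectable output tap (illustrative)."""

    def __init__(self, capacity):
        self.cells = [0] * capacity          # physical storage
        self.effective_length = capacity     # default: use the full register

    def configure(self, effective_length):
        """Set the effective length, e.g. from decoded configuration information."""
        if not 0 <= effective_length <= len(self.cells):
            raise ValueError("effective length exceeds physical capacity")
        self.effective_length = effective_length
        self.cells = [0] * len(self.cells)   # assumption: reconfiguring clears storage

    def shift_in(self, value):
        """Return the value leaving the effective register, then store `value`."""
        if self.effective_length == 0:
            return value                     # assumption: zero delay passes input through
        leaving = self.cells[self.effective_length - 1]
        self.cells = [value] + self.cells[:-1]   # shift right by one position
        return leaving
```

With a capacity of 8 and an effective length of 3, a value written by `shift_in` reappears three calls later; the remaining five cells still shift but are never read.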

When the effective length of shift register 300 (or 310) is zero, the two windows coincide, and the operation changes from an autocorrelation operation to a power computation.

FIG. 3C is a logic block diagram of the arithmetic logic unit configured for the average power operation. When the effective length of the source operand data element shift register 300 (or of the source operand data element conjugate value shift register 310) is zero, the autocorrelation operation unit can be used to compute the average power of a given number of data elements. Suppose the average power of the data elements in window A is required. The power of each element is the square of its absolute value, so the algorithm squares the absolute value of each element in window A (obtained by multiplying the element by its own conjugate) and then averages over the number of elements in the window. To compute the average power, an averaging logic unit 313 is added to the above embodiment.

In addition, conjugate unit 306 directly conjugates the most recently read data element of the source operand, and complex multiplier 307 multiplies that data element by its conjugate, yielding a real value. The complex multiplication result shift register 301 stores the real products corresponding to some of the data elements of the source operand. Each time a new real product is generated, the contents of shift register 301 move one position in a given direction, the real product corresponding to one previously stored data element is discarded, and the new real product is stored in the register position vacated by the shift. The correlation value register 309 is replaced by a power value register 314, which stores the sum of powers corresponding to the data element immediately preceding the most recently read one. Complex adder 308 is replaced by a real adder 315. Real adder 315 subtracts the real product that is about to be discarded from shift register 301 from the newly generated real product, and adds the power sum stored in the power value register for the preceding data element, giving the power sum corresponding to the most recently read data element. Averaging logic unit 313 averages the power sum produced by the real adder to obtain the average power of the corresponding data elements of the source operand.

Because the output of real adder 315 is the sum of the powers of several data elements, the function of averaging logic unit 313 is to obtain the average power of the data elements within a given window length. The simplest way to do this is a right shift (a right shift by k bits divides by 2^k, so this suits window lengths that are powers of two).

The operation of this embodiment proceeds as follows. The most recently read data element of the source operand is complex-multiplied by its own conjugate, giving a real product. The product corresponding to a previously read data element is subtracted from this product, the power sum corresponding to the data element immediately preceding the most recently read one is added, and the result is averaged. This gives the average power of the destination operand that corresponds to the most recently read data element of the source operand.
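The average power process can be sketched with integer arithmetic so that the right-shift averaging is visible. The function name `sliding_average_power`, the representation of each data element as a pair of integers, the zero-initialised storage, and the restriction to window lengths of 2**log2_window are assumptions for illustration; the patent says only that a right shift is the simplest way to average.

```python
from collections import deque


def sliding_average_power(samples, log2_window):
    """Behavioural model of FIG. 3C with a power-of-two window (illustrative).

    samples:     iterable of (real, imag) integer pairs.
    log2_window: window length is 2**log2_window elements.
    Returns one running average power per input sample.
    """
    window = 1 << log2_window
    powers = deque([0] * window, maxlen=window)   # shift register 301
    total = 0                                     # power value register 314
    averages = []

    for re, im in samples:
        power = re * re + im * im             # x times its conjugate (306, 307)
        total = total + power - powers[-1]    # real adder 315
        powers.appendleft(power)              # oldest product is discarded
        averages.append(total >> log2_window) # averaging logic unit 313
    return averages
```

Until the window has filled, the sum still contains the assumed initial zeros, so the first few outputs underestimate the average; the right shift also truncates any fractional part.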

The same techniques and teachings of the invention can readily be applied to other types of circuits or semiconductor devices that can benefit from higher pipeline throughput and improved performance. The teachings of the invention apply to any processor or machine that performs data operations, and the invention is not limited to processors or machines that perform 64-bit, 32-bit, or 16-bit data operations.

According to one aspect of the invention, a machine-readable medium is provided that has instructions stored on it. When executed by a machine, the instructions cause the machine to perform a method comprising the following steps: successively reading the data elements of a source operand having a plurality of complex-valued or real-valued data values; determining the correlation result between a plurality of newly read data elements of the source operand and a plurality of previously or newly read data elements of the source operand; and storing the correlation result, that is, successively storing the data elements of a destination operand having a plurality of complex-valued or real-valued data values.

The above is a further detailed description of the invention in connection with specific preferred embodiments, and the specific implementation of the invention should not be regarded as limited to these descriptions. A person of ordinary skill in the technical field of the invention may make a number of simple deductions or substitutions without departing from the concept of the invention, and all of these should be regarded as falling within the scope of protection of the invention.

Claims (9)

1. An autocorrelation operation unit, characterized by comprising: a correlation number processing unit, a multiplier, an adder, a second shift register, and a register; the correlation number processing unit comprises a first shift register, the input of which receives the data elements of a source operand and the output of which is connected to the multiplier; the inputs of the multiplier respectively receive the data elements of the source operand and the result output by the correlation number processing unit; the outputs of the multiplier are respectively connected to the adder and to the second shift register; the inputs of the adder are further connected respectively to the output of the second shift register and to the output of the register; and the outputs of the adder are respectively connected to the input of the register and used to output the data elements of a destination operand.

2. The autocorrelation operation unit according to claim 1, characterized in that: the correlation number processing unit further comprises a conjugate unit; either the input of the conjugate unit is connected to the output of the first shift register and the output of the conjugate unit is connected to the multiplier, or the input of the conjugate unit receives the data elements of the source operand, its output is connected to the input of the first shift register, and the output of the first shift register is connected to the multiplier.

3. The autocorrelation operation unit according to claim 1 or 2, characterized in that: the length of the first shift register is the distance between the two windows of the autocorrelation operation, and the length of the second shift register is the window length of the autocorrelation operation.

4. The autocorrelation operation unit according to claim 3, characterized in that: each time the first shift register and the second shift register receive a data item, they automatically shift by one register position in a specified direction.

5. The autocorrelation operation unit according to claim 1 or 2, characterized in that: the length of the first shift register is 0, and the autocorrelation operation unit further comprises an averaging logic unit, the input of which is connected to the output of the adder and the output of which is used to output the data elements of the destination operand.

6. A processor, comprising an algorithm data control component, characterized by further comprising: a configuration register and at least one arithmetic logic unit; the arithmetic logic unit comprises at least one autocorrelation operation unit according to any one of claims 1 to 5 for performing autocorrelation operations; the algorithm data control component is connected to the configuration register, and the configuration register is connected to the autocorrelation operation unit; the algorithm data control component executes an autocorrelation operation instruction and sends first configuration information to the configuration register, and the autocorrelation operation unit configures its own function according to the first configuration information.

7. The processor according to claim 6, characterized in that: the first configuration information comprises the number of data elements of the source operand, the window length of the autocorrelation operation, and the distance between the two windows.

8. The processor according to claim 6 or 7, characterized by further comprising an interconnect logic unit; the configuration register is further connected to the interconnect logic unit; the algorithm data control component further executes a configuration instruction and sends second configuration information to the configuration register; and the interconnect logic unit configures the input path of the source operand data and the output path of the destination operand data according to the second configuration information.

9. The processor according to claim 6, characterized in that: the algorithm data control component comprises a storage unit for storing instructions or data and a decoding unit for decoding instructions.
CN2009101050581A 2009-01-14 2009-01-14 Self-correlated arithmetic unit and processor Expired - Fee Related CN101477456B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009101050581A CN101477456B (en) 2009-01-14 2009-01-14 Self-correlated arithmetic unit and processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009101050581A CN101477456B (en) 2009-01-14 2009-01-14 Self-correlated arithmetic unit and processor

Publications (2)

Publication Number Publication Date
CN101477456A CN101477456A (en) 2009-07-08
CN101477456B true CN101477456B (en) 2011-06-08

Family

ID=40838178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009101050581A Expired - Fee Related CN101477456B (en) 2009-01-14 2009-01-14 Self-correlated arithmetic unit and processor

Country Status (1)

Country Link
CN (1) CN101477456B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8549264B2 (en) * 2009-12-22 2013-10-01 Intel Corporation Add instructions to add three source operands
CN110535847B (en) * 2019-08-23 2021-08-31 极芯通讯技术(南京)有限公司 Network processor and stack processing method of network data
CN111124492B (en) * 2019-12-16 2022-09-20 成都海光微电子技术有限公司 Instruction generation method and device, instruction execution method, processor and electronic equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101027864A (en) * 2004-09-24 2007-08-29 松下电器产业株式会社 Method for detecting symbol timing of multi-antenna radio communication system
CN101123477A (en) * 2006-07-28 2008-02-13 三星电机株式会社 Systems, nethods, and apparatuses for a long delay generation technique for spectrum-sensing of cognitive radios

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JP昭62-55764A 1987.03.11
JP特开平11-68615A 1999.03.09

Also Published As

Publication number Publication date
CN101477456A (en) 2009-07-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: JI GANG

Free format text: FORMER OWNER: PEKING UNIVERSITY SHENZHEN GRADUATE SCHOOL

Effective date: 20120803

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 518055 SHENZHEN, GUANGDONG PROVINCE TO: 519015 ZHUHAI, GUANGDONG PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20120803

Address after: 519015 Guangdong Province, Zhuhai city Xiangzhou District Jiuzhou Jiuzhou Avenue East Lane 12, Room 401

Patentee after: Ji Gang

Address before: 518055 Guangdong city in Shenzhen Province, Nanshan District City Xili Shenzhen University North Campus

Patentee before: Shenzhen Graduate School of Peking University

ASS Succession or assignment of patent right

Owner name: BEIJING ANCE HENGXING INVESTMENT CO., LTD.

Free format text: FORMER OWNER: JI GANG

Effective date: 20120924

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 519015 ZHUHAI, GUANGDONG PROVINCE TO: 100142 HAIDIAN, BEIJING

TR01 Transfer of patent right

Effective date of registration: 20120924

Address after: 100142 Beijing city Haidian District enjizhuang District F No. 46 room 338

Patentee after: Beijing Hengxing Strategy Investment Limited

Address before: 519015 Guangdong Province, Zhuhai city Xiangzhou District Jiuzhou Jiuzhou Avenue East Lane 12, Room 401

Patentee before: Ji Gang

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110608

Termination date: 20200114