JPH0298777A

JPH0298777A - Parallel sum of product arithmetic circuit and vector matrix product arithmetic method

Info

Publication number: JPH0298777A
Application number: JP25105888A
Authority: JP
Inventors: Ichiro Tamiya; 一郎民谷
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1988-10-05
Filing date: 1988-10-05
Publication date: 1990-04-11

Abstract

PURPOSE: To accelerate matrix arithmetic by operating product-sum calculators in parallel on values stored in advance in a memory circuit and values supplied on an input bus. CONSTITUTION: The memory circuit 2 can be written with values from the input bus 1 and can read out different values from output ports p1-p4 simultaneously. Product-sum calculators 3-6 calculate the products of the two input sequences supplied from the input bus 1 and from the outputs of the memory circuit 2, and accumulate them in internal accumulators. The values accumulated in the product-sum calculators 3-6 are output to the corresponding latch circuits 13-16, exactly one of the buffer gates 23-26 is selected by external control, and the contents of the latch circuits 13-16 are read out via an output bus 12. A fast matrix arithmetic operation can thereby be performed.

Description

[Detailed Description of the Invention] [Industrial Application Field] The present invention relates to an arithmetic circuit and an arithmetic method used in digital signal processing devices, in particular to a parallel product-sum calculation circuit and a calculation method using it. More specifically, it relates to a parallel product-sum calculation circuit and a vector-matrix product calculation method that can calculate the product of a vector and a matrix at high speed.

[Conventional technology]

The arithmetic circuits used in conventional digital signal processors contain a multiplier and an accumulator, and operate the two in parallel through pipelining to execute product-sum operations efficiently. This allows, for example, voice-band signal processing to be performed in real time.

However, applications such as image signal processing and image generation involve a bandwidth about 1000 times that of an audio signal, so real-time processing requires much faster matrix calculation.

Possible ways to increase speed include, besides raising the operating speed of the arithmetic circuit, introducing parallel processing with many multipliers and accumulators. In fact, recent advances in LSI technology have made it possible to mount several arithmetic units of relatively large circuit scale, such as parallel multipliers, on a single chip.

Against this background, many circuit configurations based on the systolic array approach proposed by H. T. Kung and others have been put forward, in particular as circuits that compute matrix operations at high speed through parallel processing (see, for example, IEEE Computer magazine, Vol. 15, No. 1, pp. 37-46). These matrix arithmetic circuits are hardwired circuits in which the arithmetic units are arranged so as to extract the maximum parallelism from an individual operation such as matrix multiplication while minimizing the wiring length between arithmetic units.

[Problem to be solved by the invention]

However, because each of these matrix arithmetic circuits is a single-function configuration optimized for one specific operation, it is difficult to use them as they are in the arithmetic section of a general-purpose digital signal processor.

Flexibility can be increased by introducing sub-buses or bus switches between the many arithmetic units to give more freedom in how they are connected, but this increases the hardware scale and complicates the control program, so the resulting arithmetic circuit is hard to use.

An object of the present invention is to provide a parallel product-sum calculation circuit and a vector-matrix product calculation method in which arithmetic units are arranged in parallel to speed up matrix operations, so that the circuit can also be applied to image signal processing, image generation, and similar applications, and in which the circuit configuration and control method realize flexible arithmetic processing under simple control, making the circuit suitable as the arithmetic section of a general-purpose digital signal processing circuit.

[Means to solve the problem]

The parallel product-sum calculation circuit of the present invention comprises: an input bus; a memory circuit that can simultaneously read values stored at different addresses onto N outputs; N product-sum calculators, each connected to the input bus and to one output of the memory circuit, that accumulate and store the product of the two values input from the input bus and the memory circuit; N latch circuits, each connected to one of the N product-sum calculators, that capture the calculation results; means for updating the contents of the memory circuit; means for initializing the stored contents of the product-sum calculators; and means for reading the values of the N latch circuits to the outside.

The vector-matrix product calculation method of the present invention uses the above parallel product-sum calculation circuit to calculate the vector-matrix product $C \cdot A$ of an M-dimensional vector $C = (c_1, c_2, \ldots, c_M)$ and a matrix $A = (a_{mn};\ m = 1, 2, \ldots, M;\ n = 1, 2, \ldots, N)$ with M rows and N columns, obtaining an N-dimensional vector $E = (e_1, e_2, \ldots, e_N)$. The values of the elements $a_{mn}$ of the matrix A are stored in the memory circuit in advance. The elements of the input vector C are supplied to the input bus one after another in the order $c_1, c_2, \ldots, c_M$, and the elements of each column vector of the matrix A are read out one after another in the order $a_{1n}, a_{2n}, \ldots, a_{Mn}$. The N product-sum calculators calculate the elements $e_n$ of the vector E as the inner products of the input vector C with the N column vectors. The results are captured simultaneously in the latch circuits and then read out to the outside.
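The method just described can be sketched in software. This is a minimal behavioural model under stated assumptions, not the circuit itself: the function name `vector_matrix_product` and the list-of-rows layout for A are chosen here for illustration, and the inner loop stands in for N product-sum calculators that would all update in the same step in hardware.

```python
def vector_matrix_product(c, a):
    """Model of the method: c has M elements, a is M rows by N columns."""
    n_dim = len(a[0])
    acc = [0] * n_dim                  # one accumulator per product-sum calculator
    for m, bus in enumerate(c):        # c_1, c_2, ..., c_M appear on the shared input bus
        row = a[m]                     # a_m1 ... a_mN read on the N memory outputs
        for n in range(n_dim):         # in hardware these N updates happen at once
            product = bus * row[n]
            # The first product is loaded as is; later products are accumulated.
            acc[n] = product if m == 0 else acc[n] + product
    latches = list(acc)                # all N results are captured simultaneously
    return latches


c = [1, 2, 3]
a = [[1, 0],
     [0, 1],
     [2, 2]]
print(vector_matrix_product(c, a))  # [7, 8]
```

Each accumulator only ever sees its own column of A, which is why no calculator needs another calculator's output.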

The parallel product-sum calculation circuit of the present invention is further characterized in that the product-sum calculators include a function by which subtraction can be selected under external control.

The vector-matrix product calculation method of the present invention also uses the above parallel product-sum calculation circuit, in which the product-sum calculators include the subtraction function, to calculate the complex vector-matrix product $C \cdot A$ of an M-dimensional complex vector $C = (c_m + j d_m;\ m = 1, 2, \ldots, M)$ and a complex matrix $A = (a_{mn} + j b_{mn};\ m = 1, 2, \ldots, M;\ n = 1, 2, \ldots, N)$ with M rows and N columns, obtaining the real parts $e_n$ and imaginary parts $f_n$ of an N-dimensional complex vector $E = (e_n + j f_n;\ n = 1, 2, \ldots, N)$. The real parts $a_{mn}$ and imaginary parts $b_{mn}$ of the matrix A are stored in the memory circuit in advance. The components of the complex vector C are supplied to the input bus one after another in the order $c_1, d_1, c_2, d_2, \ldots$. Each product-sum calculator is assigned in advance to calculate either a real part $e_n$ or an imaginary part $f_n$ of the complex vector E. For a product-sum calculator responsible for a real part $e_n$, the components of the column vector of the matrix A are read out from the memory circuit in the order $a_{1n}, b_{1n}, a_{2n}, b_{2n}, \ldots, a_{Mn}, b_{Mn}$, and the addition and subtraction functions of the product-sum calculator are used alternately to calculate $\sum_{i} (a_{in} c_i - b_{in} d_i)$ in 2M operations. For a product-sum calculator responsible for an imaginary part $f_n$, the components of the column vector of the matrix A are read out in the order $b_{1n}, a_{1n}, b_{2n}, a_{2n}, \ldots, b_{Mn}, a_{Mn}$, and $\sum_{i} (b_{in} c_i + a_{in} d_i)$ is calculated in 2M product-sum operations. The 2N numerical values making up the first to N-th elements of the output vector E are calculated simultaneously, each in its corresponding one of 2N product-sum calculators. The inner products obtained are captured simultaneously in the corresponding latch circuits and then read out to the outside.
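The alternation of addition and subtraction for the real part follows from expanding a single complex product term. As a worked step using only standard complex arithmetic, write $c_i + j d_i$ for the i-th element of C and $a_{in} + j b_{in}$ for the matrix element in row i and column n:

```latex
\begin{aligned}
(c_i + j d_i)(a_{in} + j b_{in})
  &= a_{in} c_i + j\, b_{in} c_i + j\, a_{in} d_i + j^{2}\, b_{in} d_i \\
  &= (a_{in} c_i - b_{in} d_i) + j\,(b_{in} c_i + a_{in} d_i), \\[4pt]
e_n &= \sum_{i=1}^{M} \left( a_{in} c_i - b_{in} d_i \right), \qquad
f_n  = \sum_{i=1}^{M} \left( b_{in} c_i + a_{in} d_i \right).
\end{aligned}
```

The minus sign comes from $j^2 = -1$, so only the calculator assigned to a real part needs to subtract, and only on the steps where an imaginary component $d_i$ is on the input bus.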

[Effect]

In the present invention, to reduce the number of calculation steps required, one product-sum calculator is assigned to the calculation of each element of the output vector.

The vector-matrix product of a vector C and a matrix A is calculated by taking the inner product of the vector C with each column vector of the matrix A, and each inner product is directly one element of the output vector. Therefore, if several product-sum calculators are provided and each is assigned the calculation of one output element, each element can be calculated without reference to the outputs of the other product-sum calculators.

If the product-sum calculators all advance through the calculation simultaneously in the same predetermined order, they all refer to the same element $c_i$ of the vector C at each step. In the parallel product-sum calculation circuit of the present invention, therefore, one of the two inputs of every product-sum calculator is shared, and the elements $c_i$ of the vector C are supplied on it in the predetermined order.

The elements of the matrix A, on the other hand, must be supplied to the product-sum calculators simultaneously, one column vector per calculator. In the parallel product-sum calculation circuit of the present invention, this is achieved by storing the elements of the matrix A in advance in a memory circuit with N outputs and reading them out simultaneously at externally specified addresses.

In this way, by fixing the values stored in advance in the memory circuit and the values supplied to the input bus, the product-sum calculators can be operated in parallel to obtain a vector-matrix product. When the read addresses of the memory circuit, the supply of data to the input bus, and the initialization timing of the product-sum calculators are controlled, by a microprogram or the like, according to the vector-matrix product calculation method of the present invention, vector-matrix products can be computed without restrictions on the dimensions of the matrix and without loss of calculation efficiency.

The present invention also enables complex vector operations. So that complex operations can be handled, the product-sum calculators are also given a subtraction function. Each product-sum calculator used is assigned in advance to calculate either the real part or the imaginary part of an element of the output complex vector, the components of the complex vector C are supplied to the input bus under external control with real and imaginary parts alternating, and the subtraction function is applied according to the formula for complex multiplication. Complex operations are realized in this way.

[Example]

Next, embodiments of the present invention will be described with reference to the drawings.

FIG. 1 shows an embodiment of the parallel product-sum calculation circuit according to the first invention. As shown in FIG. 1, the circuit of this embodiment has an input bus 1, a memory circuit 2 with four outputs, and product-sum calculators 3, 4, 5, and 6, whose details are shown in FIG. 2. It further has latch circuits 13, 14, 15, and 16, which latch the outputs of the product-sum calculators 3, 4, 5, and 6, and buffer gates 23, 24, 25, and 26, which output the values stored in the latch circuits to an output bus 12.

The memory circuit 2 is a 16-word, four-output memory circuit into which values can be written from the input bus 1. When four independent addresses of 4 bits each are applied to the address terminal 11, four different values can be read simultaneously from the output ports p1, p2, p3, and p4.

Each of the product-sum calculators 3, 4, 5, and 6 calculates the products of the two input sequences supplied from the input bus 1 and from one output of the memory circuit 2, and accumulates them in an internal accumulator.

The product-sum calculators 3, 4, 5, and 6 all have the same configuration, shown in detail in FIG. 2. As shown in FIG. 2, a product-sum calculator has a multiplier 31, which obtains the product of the values supplied to its two input ports, and an adder 32, which adds the output of the multiplier 31 to the value of an accumulator 33. Under external control, the adder 32 can also treat its accumulator-side input as "0" and pass the output of the multiplier 31 through unchanged. When input sequences are applied to the two input terminals simultaneously, the product of the first pair of values is loaded into the accumulator 33 as it is; for the second and subsequent pairs, the output of the multiplier 31 is added to the value of the accumulator 33 and the sum is stored back. By repeating this operation the required number of times, the inner product of vectors of arbitrary length can be calculated.
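A single product-sum calculator can be modelled in a few lines. This is a sketch under assumptions: the class name `ProductSumUnit` and the `first` flag are hypothetical names for the external control that zeroes the accumulator-side input of the adder.

```python
class ProductSumUnit:
    """Multiplier, adder, and accumulator of one product-sum calculator."""

    def __init__(self):
        self.acc = 0

    def step(self, x, y, first=False):
        product = x * y                       # multiplier
        # With first=True the accumulator-side adder input is treated as 0,
        # so the product is loaded as is; otherwise it is accumulated.
        self.acc = product if first else self.acc + product
        return self.acc


def inner_product(xs, ys):
    unit = ProductSumUnit()
    for i, (x, y) in enumerate(zip(xs, ys)):
        unit.step(x, y, first=(i == 0))
    return unit.acc


print(inner_product([1, 2, 3], [4, 5, 6]))  # 32
```

Loading the first product directly means no separate clearing step is needed between one inner product and the next.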

The value accumulated in each product-sum calculator is output to the corresponding latch circuit 13, 14, 15, or 16. When a capture signal is applied to their clock terminal, the latch circuits 13, 14, 15, and 16 simultaneously capture the outputs of their corresponding product-sum calculators. Exactly one of the buffer gates 23, 24, 25, and 26 is selected under external control, and the contents of the corresponding latch circuit are read out to the outside via the output bus 12.
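The role of the latches and buffer gates can be illustrated with a small model. The class name `OutputStage` and its methods are hypothetical; the point of the sketch is only that the captured values stay readable, one at a time, after the accumulators have changed.

```python
class OutputStage:
    """Latches plus one-at-a-time buffer gates (behavioural sketch)."""

    def __init__(self, units=4):
        self.latches = [0] * units

    def capture(self, accumulators):
        # Capture signal on the clock terminal: all latches load together.
        self.latches = list(accumulators)

    def read(self, select):
        # Exactly one buffer gate drives the output bus at a time.
        return self.latches[select]


accumulators = [10, 20, 30, 40]
stage = OutputStage()
stage.capture(accumulators)
accumulators[0] = 99          # the calculators may already be doing new work
print([stage.read(k) for k in range(4)])  # [10, 20, 30, 40]
```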

Thus the parallel product-sum calculation circuit of FIG. 1 according to the first invention comprises: the input bus 1; the memory circuit 2, which can simultaneously read values stored at up to N (four in this embodiment) different addresses; N product-sum calculators 3 to 6, each connected to the input bus 1 and to one output of the memory circuit 2, that accumulate and store the product of the two values input from the input bus 1 and the memory circuit 2; N latch circuits 13 to 16, each connected to one of the N product-sum calculators 3 to 6, that capture the product-sum results; means for updating the contents of the memory circuit 2; means for initializing the stored contents of the product-sum calculators 3 to 6; and means for reading the values of the N latch circuits 13 to 16 to the outside.

The parallel product-sum calculation circuit and the calculation method described above are based on the following principle.

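First, the product-sum calculation circuit of FIG. 1 and the vector-matrix product calculation method of the present invention are explained using the vector-matrix product of equation [1], the product of a four-element vector and a 4-by-4 matrix, as an example.

$$(e_1\ e_2\ e_3\ e_4) = (c_1\ c_2\ c_3\ c_4)\begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{pmatrix}, \qquad e_n = \sum_{i=1}^{4} c_i a_{in} \qquad [1]$$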

Evaluating equation [1] requires 16 product-sum operations in total, so a one-chip signal processor with a single product-sum circuit, for example, needs about 16 steps. With four product-sum calculators, it should be possible to reduce this to as few as four steps.
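The step counts quoted here can be checked with a short calculation. The helper `estimated_steps` is a hypothetical name, and the formula is an assumption that counts only product-sum steps, ignoring the time to load the matrix and read out results.

```python
import math


def estimated_steps(m_dim, n_out, units):
    """Product-sum steps for an m_dim-element input and n_out outputs.

    Assumes each pass over the input vector fills up to `units` outputs.
    """
    passes = math.ceil(n_out / units)
    return m_dim * passes


print(estimated_steps(4, 4, 1))  # 16: one product-sum calculator
print(estimated_steps(4, 4, 4))  # 4: four calculators working in parallel
```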

In the calculation method of the present invention, as described below, one product-sum calculator is assigned to the calculation of each element of the output vector in order to reduce the number of calculation steps required.

The vector-matrix product of the vector C and the matrix A is calculated by taking the inner product of the vector C with each column vector of the matrix A, and each inner product is directly one element of the output vector. Therefore, if the four product-sum calculators 3 to 6 are each assigned the calculation of one element of the output vector, each element can be calculated without reference to the outputs of the other product-sum calculators. If the product-sum calculators 3 to 6 all advance simultaneously through the subscript i of equation [1] in the order i = 1 to 4, they all refer to the same element $c_i$ of the vector C at each step. In the parallel product-sum calculation circuit of the present invention, therefore, one of the two inputs of every product-sum calculator 3 to 6 is shared, and the elements $c_i$ of the vector C are supplied on it in the order i = 1 to 4.

The elements of the matrix A, on the other hand, must be supplied to the product-sum calculators simultaneously, one column vector per calculator. In the parallel product-sum calculation circuit of the present invention, this is achieved by storing the elements of the matrix A in advance in the N-output memory circuit 2 and reading them out simultaneously at externally specified addresses.

For example, in the discrete cosine transform, which is expressed as a vector-matrix product, the matrix elements are normally constants that never change, so the coefficients need only be written to the memory circuit 2 once before processing begins. In other applications such as adaptive signal processing, the matrix element values also change infrequently, so updating them does not necessarily have to be fast.

Furthermore, in the example of a K-dimensional discrete cosine transform, the matrix has $K^2$ coefficients but only 2K distinct values, so a storage capacity of 2K words is sufficient.

In this way, duplicated matrix element values can be exploited to make effective use of the memory circuit.

When the number of dimensions of the output vector is no greater than N, the number of product-sum calculators, the result is obtained as described above in M steps, where M is the number of dimensions of the input vector. When the number of dimensions of the output vector is greater than N, N elements of the output vector are obtained first; the elements of the vector C are then supplied to the input bus again while the memory circuit 2 supplies the remaining column vectors of the matrix A, and the next N elements of the output vector are calculated. If the first N results are transferred from the accumulators 33 in the product-sum calculators to the latch circuits 13 to 16 as soon as they are obtained, $c_1$ can be supplied to the input bus immediately for the next calculation, and the product-sum calculators can be used without any idle time.
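Handling more output elements than there are product-sum calculators can be sketched as repeated passes over the input vector. The function name `tiled_vector_matrix_product` and the `units` parameter are hypothetical, and the sketch is sequential: in hardware, reading out one group of results would overlap with computing the next group.

```python
def tiled_vector_matrix_product(c, a, units=4):
    """Compute c * a with `units` product-sum calculators, N columns per pass."""
    n_out = len(a[0])
    result = []
    for start in range(0, n_out, units):
        cols = range(start, min(start + units, n_out))
        acc = [0] * len(cols)
        for m, cm in enumerate(c):          # vector C is supplied again on each pass
            for k, n in enumerate(cols):
                product = cm * a[m][n]
                acc[k] = product if m == 0 else acc[k] + product
        result.extend(acc)                  # latch capture frees the calculators
    return result


c = [1, 2]
a = [[1, 2, 3, 4, 5, 6],
     [1, 1, 1, 1, 1, 1]]
print(tiled_vector_matrix_product(c, a, units=4))  # [3, 4, 5, 6, 7, 8]
```

In this example, six outputs on four calculators take two passes, and the second pass uses only two of the four calculators.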

As described above, in the parallel product-sum calculation circuit of the present invention, the read addresses of the memory circuit 2, the supply of data to the input bus, and the initialization timing of the product-sum calculators 3 to 6 are controlled, by a microprogram or the like, according to the vector-matrix product calculation method of the present invention. This makes it possible to compute vector-matrix products without restrictions on the dimensions of the matrix and without loss of calculation efficiency.

Thus the vector-matrix product calculation method of the second invention uses the calculation circuit of the first invention to calculate the vector-matrix product $C \cdot A$ of an M-dimensional vector C and a matrix A with M rows and N columns. The elements of the matrix A are stored in the memory circuit 2 in advance, the elements of the input vector C are supplied to the input bus 1 one after another, and the elements of the column vectors of the matrix A are read out one after another from the N outputs of the memory circuit 2. The N product-sum calculators 3 to 6 thereby obtain the inner product of the input vector C with each of the N column vectors in M product-sum operations. The inner products are captured simultaneously in the corresponding latch circuits 13 to 16 and then read out to the outside.

The vector-matrix product calculation method using the parallel product-sum calculation circuit of FIG. 1 is now described concretely, with reference also to FIG. 3.

Without loss of generality, the description uses the example of equation [1] above. First, before processing begins, the elements $a_{mn}$ of the matrix in equation [1] are written to the memory circuit 2 via the input bus 1. Then, as shown in FIG. 3, the calculation is executed in four steps, for the subscript i of equation [1] running from 1 to 4, by applying the values shown there to the input bus 1, the output ports p1, p2, p3, and p4 of the memory circuit 2, and the clock terminal 17 of FIG. 1.

At i = 1, the element $c_1$ of the vector C is supplied to the input bus 1, and $a_{11}$, $a_{12}$, $a_{13}$, and $a_{14}$ are read out on the outputs p1, p2, p3, and p4 of the memory circuit 2, respectively. To do this, the addresses of $a_{11}$, $a_{12}$, $a_{13}$, and $a_{14}$ are supplied simultaneously to the address terminal 11.

In the product-sum calculator 3, the multiplier 31 outputs the product of $c_1$ and the value $a_{11}$ read out on the output port p1 of the memory circuit 2. The adder passes the output of the multiplier through unchanged, and the product $c_1 a_{11}$ is stored in the accumulator 33 of the product-sum calculator. The product-sum calculators 4, 5, and 6 perform the same operation as the product-sum calculator 3 at the same time, so the products $c_1 a_{12}$, $c_1 a_{13}$, and $c_1 a_{14}$ are stored simultaneously in their accumulators.

At i = 2, $c_2$ is supplied to the input bus 1, and in the product-sum calculator 3 the multiplier 31 outputs the product of $c_2$ and the value $a_{21}$ read out on p1. The adder 32 adds the output of the multiplier 31 to the value $c_1 a_{11}$ in the accumulator 33 and stores the sum in the accumulator 33, which therefore holds $c_1 a_{11} + c_2 a_{21}$. The product-sum calculators 4, 5, and 6 perform the same operation at the same time, so their accumulators simultaneously hold $c_1 a_{12} + c_2 a_{22}$, $c_1 a_{13} + c_2 a_{23}$, and $c_1 a_{14} + c_2 a_{24}$, respectively.

For i = 3 and 4, the product-sum calculators 3, 4, 5, and 6 repeat the same operation as for i = 2. As a result, at i = 4 the accumulators of the product-sum calculators 3, 4, 5, and 6 hold, respectively,

$$\sum_{i=1}^{4} c_i a_{i1}, \qquad \sum_{i=1}^{4} c_i a_{i2}, \qquad \sum_{i=1}^{4} c_i a_{i3}, \qquad \sum_{i=1}^{4} c_i a_{i4},$$

so that the elements of equation [1] are available in the accumulators of the product-sum calculators. The four values are stored in the latch circuits 13, 14, 15, and 16 by supplying "1" to the clock terminal 17, and can then be read out onto the output bus 12 under external control.
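The four-step walkthrough can be reproduced symbolically to see what each accumulator holds after every step. This is only an illustrative trace: the function name `trace_accumulators` is hypothetical, and the strings stand in for the values that the hardware would hold.

```python
def trace_accumulators(steps=4, units=4):
    """Return, for each step i, the symbolic contents of every accumulator."""
    acc = [""] * units
    history = []
    for i in range(1, steps + 1):
        for n in range(1, units + 1):
            term = f"c{i}*a{i}{n}"
            # Step 1 loads the product; later steps add to the accumulator.
            acc[n - 1] = term if i == 1 else acc[n - 1] + " + " + term
        history.append(list(acc))
    return history


history = trace_accumulators()
print(history[1][0])  # c1*a11 + c2*a21
print(history[3][3])  # c1*a14 + c2*a24 + c3*a34 + c4*a44
```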

Next, FIG. 4 shows an embodiment of the parallel product-sum calculation circuit according to the third invention. As shown in FIG. 4, the circuit of this embodiment has an input bus 1; a memory circuit 2; product-sum calculators 30, 4, 50, and 6; latch circuits 13, 14, 15, and 16, which latch the outputs of the product-sum calculators 30, 4, 50, and 6; buffer gates 23, 24, 25, and 26, which output the data stored in the latch circuits 13 to 16 to an output bus 12; and a control terminal 18 for controlling the product-sum calculators 30 and 50. Except for the product-sum calculators 30 and 50 and the control terminal 18, the parallel product-sum calculation circuit of FIG. 4 has the same configuration and the same functions as the embodiment shown in FIGS. 1 and 2.

The accumulation function of the product-sum calculators 30 and 50 is extended with a function by which subtraction can be selected under external control.

That is, the product-sum calculators 30 and 50 calculate the products of the two input sequences supplied from the input bus 1 and from the outputs p1 and p3 of the memory circuit 2, respectively, and accumulate them in their internal accumulators. The configuration of the product-sum calculators 30 and 50 is shown in FIG. 5.

As shown in FIG. 5, each of the product-sum calculators 30 and 50 has a multiplier 31 that calculates the product of the values supplied to its two input ports, and an adder 32 that adds the output of the multiplier 31 to the value of the accumulator 33. The adder 32 can also treat its accumulator-side input as "0" and pass the output of the multiplier 31 unchanged to the accumulator 33. The calculator further includes a subtractor 34, which subtracts the output value of the multiplier 31 from the value of the accumulator 33 and outputs the result. The outputs of the adder 32 and the subtractor 34 are input to a selector 35. When the control terminal 18 is "0" the output of the adder 32 is selected, when it is "1" the output of the subtractor 34 is selected, and the selected value is output to the accumulator 33. Thus, when the adder 32 is selected, this product-sum calculator can store in the accumulator 33 either the output of the multiplier 31 as it is or that output added to the value of the accumulator 33; when the subtractor 34 is selected, it can store in the accumulator 33 the value of the accumulator 33 minus the output of the multiplier 31.
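The behaviour of the product-sum calculator of FIG. 5 can be sketched as a small software model. This is only an illustration in Python, not the circuit itself: the class name `MacUnit`, the method `step`, and the `load` and `subtract` flags are hypothetical names, with `subtract` standing in for control terminal 18 and `load` for the adder mode in which the accumulator-side input is treated as "0".

```python
class MacUnit:
    """Behavioural sketch of the product-sum calculator of FIG. 5 (hypothetical names)."""

    def __init__(self):
        self.acc = 0  # accumulator 33

    def step(self, x, y, subtract=False, load=False):
        product = x * y                  # multiplier 31
        base = 0 if load else self.acc   # adder 32 may treat the accumulator input as 0
        added = base + product           # adder 32
        subtracted = self.acc - product  # subtractor 34
        # selector 35: control terminal 18 = "1" picks the subtractor, "0" the adder
        self.acc = subtracted if subtract else added
        return self.acc


mac = MacUnit()
mac.step(3, 4, load=True)      # accumulator <- 3*4
mac.step(2, 5)                 # accumulator <- 12 + 2*5
mac.step(1, 6, subtract=True)  # accumulator <- 22 - 1*6
print(mac.acc)
```

Modelling the pass-through as a separate `load` flag is a design choice of this sketch; the source only says that the adder can take its accumulator-side input as "0".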

In this way, in the parallel product-sum calculation circuit according to the third invention, the product-sum calculators have a subtraction function in addition to the accumulation function of the product-sum calculators of the first invention, so that complex arithmetic can also be handled.

The vector-matrix product calculation method according to the fourth invention uses the calculation circuit of the third invention, configured as above, to obtain the complex vector-matrix product of an M-dimensional complex vector C and a complex matrix A of M rows and N columns. The numerical values making up the matrix A are stored in the storage circuit 2 in advance; each product-sum calculator is assigned in advance the calculation of either the real part or the imaginary part of the output complex vector; the components of the complex vector C are supplied to the input bus 1 under external control with real and imaginary parts alternating; and complex arithmetic is realized by selecting the subtraction function of the third invention in accordance with the formula for complex multiplication.

The principle of the third and fourth inventions, which make such complex vector operations possible, is explained using the calculation example of equation [2].

Here, the complex vector-matrix product of a two-dimensional complex vector C and a complex matrix of 2 rows and 2 columns is computed.

Note that $j^2 = -1$.

In digital calculation a complex number is represented as a pair of numerical values, the real part and the imaginary part, so the two-dimensional complex vector that results from equation [2] is actually represented by four numerical values.
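Equation [2] itself does not survive in this text. The following is a reconstruction from the symbols used in the step-by-step description and in claim 4, offered as a reading aid under that assumption rather than as a copy of the original formula. The names $e_n$ and $f_n$ for the real and imaginary parts of the result are taken from claim 4.

```latex
\[
\begin{pmatrix} c_1 + j d_1 & c_2 + j d_2 \end{pmatrix}
\begin{pmatrix}
a_{11} + j b_{11} & a_{12} + j b_{12} \\
a_{21} + j b_{21} & a_{22} + j b_{22}
\end{pmatrix}
=
\begin{pmatrix} e_1 + j f_1 & e_2 + j f_2 \end{pmatrix}
\]
\[
e_n = \sum_{i=1}^{2} \left( c_i a_{in} - d_i b_{in} \right), \qquad
f_n = \sum_{i=1}^{2} \left( c_i b_{in} + d_i a_{in} \right), \qquad n = 1, 2 .
\]
```

The four numerical values mentioned above are then $e_1$, $f_1$, $e_2$ and $f_2$.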

Therefore, four product-sum calculators are prepared, and each is assigned the calculation of one of the four values enclosed by the Σ symbols on the right-hand side of equation [2]: $\sum_i (c_i a_{i1} - d_i b_{i1})$, $\sum_i (c_i b_{i1} + d_i a_{i1})$, $\sum_i (c_i a_{i2} - d_i b_{i2})$, $\sum_i (c_i b_{i2} + d_i a_{i2})$. Focusing on the calculation in a single product-sum calculator and fixing the subscript $i$, the calculation inside the parentheses consists of two multiplications and one addition or subtraction, so computing it for $i = 1$ and $i = 2$ takes four product-sum operations, as does the calculation of each element of equation [1].
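The four sums and the count of four product-sum operations per sum can be checked numerically. The sketch below uses made-up sample values and hypothetical function names (`real_part`, `imag_part`), and compares the result with Python's built-in complex arithmetic; indices are 0-based here, unlike the 1-based subscripts in the text.

```python
# Hypothetical sample data: C = (c_i + j d_i), A = (a_in + j b_in), with i, n in {0, 1}.
c, d = [1, 2], [3, 4]
a = [[5, 6], [7, 8]]
b = [[9, 10], [11, 12]]


def real_part(n):
    """Sum over i of (c_i a_in - d_i b_in); returns (value, product-sum operations used)."""
    acc, ops = 0, 0
    for i in range(2):
        acc += c[i] * a[i][n]  # one product-sum operation
        acc -= d[i] * b[i][n]  # one product-sum operation with subtraction selected
        ops += 2
    return acc, ops


def imag_part(n):
    """Sum over i of (c_i b_in + d_i a_in); returns (value, product-sum operations used)."""
    acc, ops = 0, 0
    for i in range(2):
        acc += c[i] * b[i][n]
        acc += d[i] * a[i][n]
        ops += 2
    return acc, ops


results = [real_part(0), imag_part(0), real_part(1), imag_part(1)]
print(results)

# Cross-check against built-in complex numbers.
C = [complex(c[i], d[i]) for i in range(2)]
A = [[complex(a[i][n], b[i][n]) for n in range(2)] for i in range(2)]
E = [sum(C[i] * A[i][n] for i in range(2)) for n in range(2)]
print(E)
```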

Furthermore, if attention is paid to the order of the four product-sum operations, the four product-sum calculators all refer, at the same time, to the same real part $c_i$ or imaginary part $d_i$ of an element of the complex vector C, just as in the calculation of equation [1]. The parallel product-sum calculation circuit can therefore share a common input bus, as in the first invention, with the elements of the complex matrix A stored in the storage circuit in advance.

When the elements of the input complex vector C are supplied with real and imaginary parts alternating, for example in the order $c_1, d_1, c_2, d_2, \ldots$, the four outputs of the storage circuit only need to be read in step with the input order of the elements of C so that each multiplication in equation [2] is performed. Specifically, the product-sum calculator that computes the real part of the n-th column is fed $a_{1n}, b_{1n}, a_{2n}, b_{2n}, \ldots$ in that order, and the product-sum calculator that computes the imaginary part of the n-th column is fed $b_{1n}, a_{1n}, b_{2n}, a_{2n}, \ldots$ in that order.
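The read-out order just described can be written as a short generator. This is a sketch only: the function name `read_order` and the tuple encoding `("a", m, n)` for the matrix entry $a_{mn}$ are assumptions made for illustration.

```python
def read_order(n, M, part):
    """Matrix entries fed to one product-sum calculator while c_1, d_1, ..., c_M, d_M
    appear on the input bus. `part` is "real" or "imag"; n and m are 1-based."""
    order = []
    for m in range(1, M + 1):
        if part == "real":
            # c_m meets a_mn, then d_m meets b_mn
            order += [("a", m, n), ("b", m, n)]
        else:
            # c_m meets b_mn, then d_m meets a_mn
            order += [("b", m, n), ("a", m, n)]
    return order


print(read_order(1, 2, "real"))
print(read_order(1, 2, "imag"))
```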

However, in the calculation of $\sum_i (c_i a_{in} - d_i b_{in})$, which corresponds to the real part of the output, the $d_i b_{in}$ terms must be accumulated with their sign inverted. The product-sum calculators 30 and 50, which are in charge of the real parts of the output, are therefore given a subtraction function in addition to the function of adding the multiplication result, and can be controlled from outside so that the product of $d_i$ and $b_{in}$ is subtracted.

Below, the vector-matrix product calculation method using the parallel product-sum calculation circuit of FIG. 4 is explained concretely with reference to FIG. 6. Without loss of generality, the explanation uses the calculation example of equation [2] above. First, prior to processing, the elements $a_{mn}$, $b_{mn}$ of the matrix A of equation [2] are written into the storage circuit 2 via the input bus 1. Then, over the four steps k = 1 to 4, the calculation proceeds while the elements of the vector C of equation [2] are supplied to the input bus 1 with real and imaginary parts alternating, in the order $c_1, d_1, c_2, d_2$.

At k = 1, the element $c_1$ of the vector C is supplied to the input bus 1, and at the same time $a_{11}$, $b_{11}$, $a_{12}$, $b_{12}$ are read from the storage circuit 2 to the product-sum calculators 30, 4, 50, 6, respectively.

In the product-sum calculator 30, the multiplier 31 outputs the product $c_1 a_{11}$ of $c_1$ and the value $a_{11}$ read at the same time from the storage circuit 2 to p1. "0" is supplied to the control terminal 18 so that the selector 35 selects the output of the adder 32, and the adder 32 passes the output of the multiplier 31 through unchanged, so that $c_1 a_{11}$ is stored in the accumulator 33. The product-sum calculators 4, 50, 6 operate in the same way as the product-sum calculator 30, and their accumulators hold $c_1 b_{11}$, $c_1 a_{12}$, $c_1 b_{12}$, respectively.

At k = 2, the imaginary part $d_1$ of the complex vector C is supplied to the input bus 1. In the product-sum calculator 30, the multiplier 31 outputs the product of $d_1$ and $b_{11}$, which is read at the same time from the storage circuit 2 to p1. At this time "1" is applied to the control terminal 18 so that the output of the subtractor 34 is supplied to the accumulator 33: the multiplier output $d_1 b_{11}$ is subtracted from the accumulator value $c_1 a_{11}$, and $c_1 a_{11} - d_1 b_{11}$ is stored in the accumulator 33. The product-sum calculator 50 performs the same operation on $d_1$ and $b_{12}$, read to p3, so that its accumulator holds $c_1 a_{12} - d_1 b_{12}$. Meanwhile, in the product-sum calculators 4 and 6, the products $d_1 a_{11}$ and $d_1 a_{12}$ of $d_1$ on the input bus 1 and $a_{11}$, $a_{12}$ read from the storage circuit 2 to p2 and p4 are added into the respective accumulators, giving $c_1 b_{11} + d_1 a_{11}$ and $c_1 b_{12} + d_1 a_{12}$.

At k = 3, the element $c_2$ of the vector C is supplied to the input bus 1, and at the same time $a_{21}$, $b_{21}$, $a_{22}$, $b_{22}$ are read from the storage circuit 2 to the product-sum calculators 30, 4, 50, 6, respectively.

In the product-sum calculator 30, the multiplier 31 outputs the product of $c_2$ and $a_{21}$, which is supplied at the same time from the storage circuit 2 to p1. "0" is applied to the control terminal 18 so that the selector 35 selects the adder 32, which outputs the sum of this product and the accumulator value; the sum is stored in the accumulator 33. The product-sum calculator 50 performs the same operation. As a result, the accumulators of the product-sum calculators 30 and 50 hold $c_1 a_{11} - d_1 b_{11} + c_2 a_{21}$ and $c_1 a_{12} - d_1 b_{12} + c_2 a_{22}$, respectively.

In the product-sum calculators 4 and 6, the same operation as at k = 2 is repeated on $c_2$ supplied on the input bus 1 and on $b_{21}$, $b_{22}$ read from the storage circuit 2 to p2 and p4, respectively, so that $c_1 b_{11} + d_1 a_{11} + c_2 b_{21}$ and $c_1 b_{12} + d_1 a_{12} + c_2 b_{22}$ are stored at the same time.

At k = 4, each product-sum calculator repeats the operation it performed at k = 2, this time on $d_2$ supplied on the input bus 1 and on $b_{21}$, $a_{21}$, $b_{22}$, $a_{22}$, which are read simultaneously from the storage circuit 2 to p1, p2, p3, p4.
! I+a! l+b! ! For +322, each repeats the same operation as when =2.

As described above, in the four steps k = 1 to 4 the four values that make up the result of equation [2] are obtained in the accumulators of the product-sum calculators. The values obtained are stored simultaneously in the latch circuits and read out to the output bus 12 under external control (not shown).
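The four steps k = 1 to 4 can be traced in software. The sketch below is an assumption-laden illustration, not a description of FIG. 6: the sample values and the dictionary names (`units`, `column`, `is_real`, `acc`, `latches`) are hypothetical, and the loop over `units` stands in for what the hardware does in parallel.

```python
# Hypothetical sample values for equation [2]; keys use the 1-based subscripts of the text.
c, d = {1: 1, 2: 2}, {1: 3, 2: 4}
a = {(1, 1): 5, (1, 2): 6, (2, 1): 7, (2, 2): 8}
b = {(1, 1): 9, (1, 2): 10, (2, 1): 11, (2, 2): 12}

units = [30, 4, 50, 6]                # product-sum calculators fed by ports p1, p2, p3, p4
column = {30: 1, 4: 1, 50: 2, 6: 2}   # output column n handled by each calculator
is_real = {30: True, 4: False, 50: True, 6: False}
acc = {u: 0 for u in units}

bus_sequence = [("c", 1), ("d", 1), ("c", 2), ("d", 2)]  # values on input bus 1 for k = 1..4
for k, (kind, i) in enumerate(bus_sequence, start=1):
    x = c[i] if kind == "c" else d[i]
    for u in units:                   # sequential here, simultaneous in the circuit
        n = column[u]
        # Real-part calculators read a on c steps and b on d steps; imaginary-part ones the reverse.
        reads_a = (kind == "c") == is_real[u]
        y = a[(i, n)] if reads_a else b[(i, n)]
        subtract = is_real[u] and kind == "d"  # control terminal 18 = "1"
        if k == 1:
            acc[u] = x * y            # adder passes the product through unchanged
        elif subtract:
            acc[u] -= x * y
        else:
            acc[u] += x * y
    print(k, acc)

latches = dict(acc)                   # results captured by the latch circuits together
print(latches)
```

With these sample values the latched results correspond to $e_1$, $f_1$, $e_2$, $f_2$ for calculators 30, 4, 50, 6 in that order.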

[Effect of the invention]

As described above, with the parallel product-sum calculation circuit and the vector-matrix product calculation method according to the present invention, a vector-matrix product can be obtained by operating the product-sum calculators in parallel on values stored in advance in the storage circuit and values supplied to the input bus. In addition, because the elements of the matrix are held in the storage circuit, a matrix with duplicated values, such as a symmetric matrix, requires less memory capacity, so the storage circuit can be used efficiently. Furthermore, because the calculation method is divided up by element of the output vector, the calculation of vectors larger than the number of product-sum calculators is also easy to program by repeating the calculation.
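The source does not spell out how the repeated calculation is organized. One plausible reading, sketched below for the real-valued case of claim 2, is to process the output elements in groups no larger than the number of product-sum calculators. The function name `vector_matrix_product` and the parameter `num_units` are hypothetical, and the sketch assumes the storage circuit can supply the columns of each group in turn.

```python
def vector_matrix_product(C, A, num_units=4):
    """Compute E = C . A with at most `num_units` output elements per pass."""
    M, N = len(A), len(A[0])
    E = [0] * N
    for start in range(0, N, num_units):      # one pass per group of output elements
        cols = range(start, min(start + num_units, N))
        acc = {n: 0 for n in cols}            # accumulators cleared before each pass
        for m in range(M):                    # c_m placed on the shared input bus
            for n in cols:                    # done in parallel by the hardware
                acc[n] += C[m] * A[m][n]
        for n in cols:                        # latch and read out this group
            E[n] = acc[n]
    return E


print(vector_matrix_product([1, 2], [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]))
```

In this sketch the input vector is replayed on the bus once per group, so the number of passes grows with N divided by the number of calculators.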

[Brief description of the drawings]

FIG. 1 is a diagram showing an embodiment of the first invention; FIG. 2 is a diagram showing an example of the detailed configuration of a product-sum calculator; FIG. 3 is a diagram for explaining the calculation method of the second invention; FIG. 4 is a diagram showing an embodiment of the third invention; FIG. 5 is a diagram showing an example of the detailed configuration of a product-sum calculator that also has a subtraction function; and FIG. 6 is a diagram for explaining the calculation method of the fourth invention.

1: input bus; 2: storage circuit; 3 to 6, 30, 50: product-sum calculators; 13 to 16: latch circuits.

Claims

[Claims]

(1) A parallel product-sum calculation circuit comprising: an input bus; a storage circuit capable of simultaneously reading out values stored at different addresses to N outputs; N product-sum calculators, each connected to the input bus and to one output of the storage circuit, that calculate the product of the two values and accumulate and store it; N latch circuits, each connected to one of the N product-sum calculators, that capture the calculation results; means for updating the contents of the storage circuit; means for initializing the stored contents of the product-sum calculators; and means for reading out the values of the latch circuits to the outside.

(2) A vector-matrix product calculation method that uses the parallel product-sum calculation circuit according to claim 1 to calculate the vector-matrix product C·A of an M-dimensional vector $C = (c_1\ c_2\ \ldots\ c_M)$ and a matrix $A = (a_{mn} \mid m = 1, 2, \ldots, M;\ n = 1, 2, \ldots, N)$ of M rows and N columns and thereby obtain an N-dimensional vector $E = (e_1\ e_2\ \ldots\ e_N)$, characterized in that: the values of the elements $a_{mn}$ of the matrix A are stored in the storage circuit in advance; the elements $c_m$ of the input vector C are supplied to the input bus sequentially in the order $c_1, c_2, \ldots, c_M$, while the elements of the column vectors constituting the matrix A are each read out sequentially in the order $a_{1n}, a_{2n}, \ldots, a_{Mn}$; the N product-sum calculators calculate the elements $e_n$ of the vector E by inner product operations of the input vector C with the N column vectors; and the results are captured simultaneously in the latch circuits and then read out to the outside.

(3) The parallel product-sum calculation circuit according to claim 1, wherein the function of the product-sum calculation unit includes a function in which subtraction can be selected by external control.

(4) A vector-matrix product calculation method that uses the parallel product-sum calculation circuit according to claim 3 to calculate the complex vector-matrix product C·A of an M-dimensional complex vector $C = (c_m + j d_m \mid m = 1, 2, \ldots, M)$ and a complex matrix $A = (a_{mn} + j b_{mn} \mid m = 1, 2, \ldots, M;\ n = 1, 2, \ldots, N)$ of M rows and N columns and thereby obtain an N-dimensional complex vector $E = (e_n + j f_n \mid n = 1, 2, \ldots, N)$, wherein: the components of the complex vector C are supplied sequentially to the input bus in the order $c_1, d_1, c_2, d_2, \ldots$; the calculation of the real part $e_n$ or the imaginary part $f_n$ of the complex vector E is assigned to each product-sum calculator in advance; for the product-sum calculator in charge of calculating the real part $e_n$, the components of the column vector of the matrix A are read out sequentially in the order $a_{1n}, b_{1n}, a_{2n}, b_{2n}, \ldots, a_{Mn}, b_{Mn}$, and the addition and subtraction functions of the product-sum calculator are used alternately to calculate $\sum_i (a_{in} c_i - b_{in} d_i)$; for the product-sum calculator in charge of calculating the imaginary part $f_n$, the components of the column vector of the matrix A are read out sequentially in the order $b_{1n}, a_{1n}, b_{2n}, a_{2n}, \ldots, b_{Mn}, a_{Mn}$ to calculate $\sum_i (b_{in} c_i + a_{in} d_i)$; the 2N numerical values from the first element to the N-th element of the output vector E are calculated simultaneously in the corresponding 2N product-sum calculators; and the inner product values obtained are captured simultaneously in the corresponding latch circuits and then read out to the outside.