JPH08235129A - Parallel processor

Info

Publication number: JPH08235129A
Application number: JP3744495A
Authority: JP
Inventors: Mitsuharu Oki; 光晴大木; Masuyoshi Kurokawa; 益義黒川
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1995-02-27
Filing date: 1995-02-27
Publication date: 1996-09-13

Abstract

PURPOSE: To efficiently perform multiplication with a filter whose nonzero coefficients form groups, such as the filter used in a ghost canceler. CONSTITUTION: A plurality of serial/parallel converters 1 to 3 perform their conversions at mutually shifted timings. Data that arrive far apart in time are thereby delivered directly to the processor element 4(j) that needs them, or to a nearby processor element, instead of being passed step by step between adjacent processor elements 4(j). This shortens the data transfer time and allows the computation to finish within one horizontal period.

Description

Detailed Description of the Invention

【０００１】[0001]

[Field of Industrial Application] The present invention relates to a parallel processor used for digital processing of video signals and the like.

【０００２】[0002]

[Prior Art] As a device for digitally processing video signals, the device described in, for example, "SVP: SERIAL VIDEO PROCESSOR / Proceedings of the IEEE 1990 CUSTOM INTEGRATED CIRCUITS CONFERENCE / P.17.3.1-4" is known.

[0003] Specifically, this device is constituted by a parallel processor such as that shown in FIG. 10. In FIG. 10, a video signal in which each pixel consists of a plurality of bits is supplied word (pixel) serially from a serial data input terminal (hereinafter, SIN) and is stored in a serial/parallel converter 101 (hereinafter abbreviated as SP) having a capacity (m) of one horizontal period (1H). The m pieces of data stored in the SP 101 are output from the m corresponding parallel data output terminals (POUT1 to POUTm). In FIG. 10, m = 100 for simplicity. In general, m is the number of data in one horizontal period, and so ranges from several hundred to several thousand.

[0004] The m pieces of data output from the parallel data output terminals (POUT1 to POUTm) are input to the m corresponding processor elements (hereinafter abbreviated as PE) 102(1) to 102(m). Each PE 102(j) has a memory and an arithmetic circuit, described later, and performs a desired operation in accordance with control signals from a control circuit 103.

[0005] Each PE 102(j) is also wired so that it can exchange data with the adjacent PE 102(j-1) and PE 102(j+1). This data exchange is likewise controlled by the control circuit 103.

[0006] After the desired operation has been performed in PE 102(1) to PE 102(m), the resulting m pieces of data are input to the m parallel data input terminals (PIN1 to PINm) of a parallel/serial converter 104 (hereinafter abbreviated as PS) having a capacity (m) of one horizontal period (1H). The data input in parallel from the parallel data input terminals (PIN1 to PINm) are converted to word (pixel) serial form and output word (pixel) serially from a serial data output terminal (SOUT).

[0007] Accordingly, in this device, the data of each pixel of the video signal supplied to the SP 101 in each horizontal period are stored in the PEs 102(j) within the following horizontal blanking period. The data stored in the PEs 102(j) are processed by the arithmetic circuits in the PEs 102(j) during the next horizontal period. Then, within the horizontal blanking period after that, the processed data in the PEs 102(j) are input to the PS 104, and a processed video signal is taken out for each horizontal period. Digital processing of a video signal, for example, is carried out in this way.

[0008] There is only one control circuit 103 for controlling the operations in PE 102(1) to PE 102(m), and it is common to all m PEs. That is, FIG. 4 shows a SIMD (Single Instruction Multiple Data) parallel processor having the same number of PEs as the number of data (m) in one horizontal period. In video signal processing the same operation is usually applied to every pixel, so the SIMD scheme, in which the same instruction is given to all processor elements, is fully adequate and causes no inconvenience. The SIMD scheme also has the advantage that a single control circuit suffices, which keeps the circuit small.
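The SIMD arrangement described above can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `simd_step` and the example instruction are hypothetical, and each list element stands for the data held by one PE.

```python
from typing import Callable, List


def simd_step(pe_data: List[int], instruction: Callable[[int], int]) -> List[int]:
    """Apply one broadcast instruction to the data held by every PE.

    A single control circuit issues `instruction`; all PEs execute it
    on their own pixel in the same step.
    """
    return [instruction(value) for value in pe_data]


# One horizontal line of m = 100 hypothetical pixel values, one per PE.
line = list(range(100))
doubled = simd_step(line, lambda pixel: 2 * pixel)
```

Because every PE runs the same instruction, only one instruction source is needed regardless of m, which is the circuit-size advantage the paragraph mentions.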

[0009] The SIMD parallel processor described above, which is an example of a conventional circuit, will now be explained in detail.

[0010] As shown in FIG. 11, the SP 101 of FIG. 10 is composed of registers 111(1) to 111(100), one-bit unit delay elements 112(1) to 112(100), and switches 113(1) to 113(100). The delay elements 112(1) to 112(100) are connected in series, and an ON signal from a write pointer input terminal (WPTR) is passed successively through the delay element 112(1), the delay element 112(2), the delay element 112(3), and so on to the delay element 112(100). A switch 113(j) is provided at the input of each register 111(j) (j = 1 to 100); when the switch 113(j) is turned on, the data from the serial data input terminal (SIN) are stored in the register 111(j). The output of the delay element 112(j) is used as the ON signal for the switch 113(j).

[0011] The output of the register 111(j) is input to the PE 102(j) via the parallel data output terminal (POUTj). When an ON signal is applied to the write pointer input terminal (WPTR) as shown in FIG. 12(b), the data Di1, Di2, Di3, ..., Di99, Di100 that arrive serially at the serial data input terminal (SIN) during the period T21, as shown in FIG. 12(a), are stored in the registers 111(1) to 111(100), respectively, as shown in FIG. 12(c), and are output from POUT1, POUT2, POUT3, ..., POUT99, POUT100. In particular, at time T22, Dij is stored in every register and is output from every parallel data output terminal (POUTj) (j = 1 to 100).
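The behaviour of the SP 101 can be sketched in Python as follows. This is a simplified behavioural model, not the circuit itself: the function name `serial_to_parallel` is hypothetical, 0-based list positions stand for the 1-based reference numerals, and one loop iteration is assumed to correspond to one word clock.

```python
def serial_to_parallel(serial_words, m=100):
    """Model the SP of FIG. 11: a single ON pulse walks down a delay chain."""
    registers = [None] * m   # registers 111(1)..111(m)
    pointer = [False] * m    # outputs of delay elements 112(1)..112(m)
    on_signal = True         # ON pulse applied to WPTR just before the first word
    for word in serial_words:
        # Each clock the pulse advances by one delay element.
        pointer = [on_signal] + pointer[:-1]
        on_signal = False
        for j, enabled in enumerate(pointer):
            if enabled:      # switch 113(j+1) is on, so register 111(j+1) latches SIN
                registers[j] = word
    return registers


pout = serial_to_parallel([f"Di{j}" for j in range(1, 101)])
```

After the hundredth word the pulse has visited every delay element once, so each register holds the word that was on SIN when its switch was on.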

[0012] As shown in FIG. 13, each PE 102(j) (j = 1 to 100) is composed of a memory 121 and an arithmetic circuit 122. The data Dij from the SP 101 are stored in the memory 121 via an input terminal (PEIN) at time T22. The arithmetic circuit 122 performs the desired operation on the data stored in the memory 121. In doing so, data are exchanged with the adjacent PE 102(j-1) and PE 102(j+1) via input/output terminals (LFT, RGT) as necessary. The LFT terminal is for exchanging data with the PE 102(j-1), and the RGT terminal is for exchanging data with the PE 102(j+1).

[0013] The data DOUTj (j = 1 to 100) resulting from the operation in each PE 102(j) (j = 1 to 100) are output from an output terminal (PEOUT) and input to the corresponding parallel data input terminal PINj (j = 1 to 100) of the PS 104. As noted above, the PEs 102(j) (j = 1 to 100) are controlled during this operation by the control circuit 103.

[0014] As shown in FIG. 14, the PS 104 is composed of registers 131(1) to 131(100), one-bit unit delay elements 132(1) to 132(100), and switches 133(1) to 133(100). The delay elements 132(1) to 132(100) are connected in series, and an ON signal from a read pointer input terminal (RPTR) is passed successively through the delay element 132(1), the delay element 132(2), the delay element 132(3), and so on to the delay element 132(100). The data DOUTj (j = 1 to 100) input from the parallel data input terminals (PINj) (j = 1 to 100) are each first stored in the register 131(j). A switch 133(j) is provided at the output of each register 131(j) (j = 1 to 100); when the switch 133(j) is turned on, the data stored in the register 131(j) are output from the serial data output terminal (SOUT).

[0015] The output of the delay element 132(j) (j = 1 to 100) is used as the ON signal for the switch 133(j). In other words, applying an ON signal to the read pointer input terminal (RPTR) causes the data DOUT1, DOUT2, DOUT3, ..., DOUT99, DOUT100 to be output serially, in that order, from the serial data output terminal (SOUT).

[0016] This flow is explained with reference to the timing chart of FIG. 15. When the data DOUTj are input from the terminals PINj (j = 1 to 100) as shown in FIG. 15(a), each is first stored in the register 131(j) as shown in FIG. 15(b). When an ON signal is then input to the read pointer input terminal (RPTR) at time T31 as shown in FIG. 15(c), the data DOUTj are output one after another from the serial data output terminal (SOUT) during the subsequent period T32, as shown in FIG. 15(d).
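The read side can be modelled in the same spirit. The following Python sketch is a behavioural illustration only; the function name `parallel_to_serial` is hypothetical, and one loop iteration is assumed to correspond to one word clock after the ON signal is applied to RPTR.

```python
def parallel_to_serial(parallel_words):
    """Model the PS of FIG. 14: a single ON pulse selects one register per clock."""
    registers = list(parallel_words)  # registers 131(1)..131(m), loaded in parallel
    m = len(registers)
    pointer = [False] * m             # outputs of delay elements 132(1)..132(m)
    on_signal = True                  # ON pulse applied to RPTR
    serial_out = []
    for _ in range(m):
        pointer = [on_signal] + pointer[:-1]
        on_signal = False
        j = pointer.index(True)       # the one switch 133(j+1) that is currently on
        serial_out.append(registers[j])
    return serial_out


sout = parallel_to_serial([f"DOUT{j}" for j in range(1, 101)])
```

Since exactly one switch is on at a time, the registers never drive SOUT simultaneously, and the words leave in the order DOUT1, DOUT2, and so on.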

[0017] FIG. 16 shows a timing chart of the overall operation of this conventional parallel processor.

[0018] As shown in FIG. 16(a), the video signal is supplied word (pixel) serially from the serial data input terminal (SIN) of the SP 101 of FIG. 11 (time T1). As shown in FIG. 16(b), an ON signal is input from the write pointer input terminal (WPTR) of the SP 101 immediately before the first data of one horizontal period arrive. Consequently, when the first input data (Di1) arrive, the ON signal has been passed to the delay element 112(1) and the switch 113(1) is on, so Di1 is stored in the register 111(1). When the next input data (Di2) arrive, the ON signal has been passed to the delay element 112(2) and the switch 113(2) is on, so Di2 is stored in the register 111(2). The input data Di3 to Di100 are likewise stored in the registers 111(3) to 111(100). In this way, the data for one horizontal period (1H) are stored in the registers 111(1) to 111(100).

[0019] Within the following horizontal blanking period (time T2), as shown in FIG. 16(c), the data (Di1 to Di100) stored in the registers 111(1) to 111(100) are supplied to PE 102(1) to PE 102(100), respectively, via the parallel data output terminals (POUT1 to POUT100) of the SP 101.

[0020] The desired operation is performed during the following horizontal period (time T3). That is, each PE 102(j) (j = 1 to 100) stores the data Dij received at time T2 in its memory 121 and, within time T3, reads the data from the memory 121, operates on them in the arithmetic circuit 122, and stores the result (DOUTj) back in the memory 121. This is controlled by the control circuit (CTRL) 103.

[0021] Within the horizontal blanking period after that (time T4), as shown in FIG. 16(d), the data DOUTj (j = 1 to 100) are supplied to the registers 131(1) to 131(100) in the PS 104 via the parallel data input terminals PIN1 to PIN100 of the PS 104.

[0022] Immediately before the horizontal period that follows this storing (time T5), an ON signal is input from the read pointer input terminal (RPTR) of the PS 104, as shown in FIG. 16(e). The ON signal is passed to the delay element 132(1), so the switch 133(1) turns on first, and the result data DOUT1 stored in the register 131(1) are output from the serial data output terminal (SOUT). The ON signal is then passed to the delay element 132(2), so the switch 133(2) turns on, and the result data DOUT2 stored in the register 131(2) are output from the serial data output terminal (SOUT). The result data DOUT3 to DOUT100 stored in the registers 131(3) to 131(100) are output from the serial data output terminal (SOUT) in the same way. Thus, as shown in FIG. 16(f), the result data for one horizontal period (1H) are output word (pixel) serially from SOUT.
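The timing of FIG. 16 described in paragraphs [0018] to [0022] can be summarised as a three-stage pipeline. The sketch below is an interpretation rather than something the text states in this form: it assumes that horizontal periods are numbered consecutively, that the blanking-period transfers (T2, T4) happen between them, and that successive lines overlap in the three stages. The function names are hypothetical.

```python
def schedule(line_index):
    """Horizontal period in which each stage handles the given line."""
    return {
        "serial_input": line_index,       # T1: words are written into the SP
        "compute": line_index + 1,        # T3: the PEs process the line
        "serial_output": line_index + 2,  # T5: words are read out of the PS
    }


def active_lines(period):
    """Lines occupying the SP, the PEs, and the PS during one horizontal period."""
    return {"input": period, "compute": period - 1, "output": period - 2}
```

Under these assumptions each line appears at SOUT two horizontal periods after it entered SIN, while the PEs have one full horizontal period available for the computation.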

[0023] The next data (Di1' to Di100'), which arrive one horizontal period later, and the data after that (Di1'' to Di100''), which arrive a further horizontal period later, are handled in the same way: the desired operation is performed and the resulting data (DOUT1' to DOUT100', DOUT1'' to DOUT100'') are output.

[0024] Now, as an example of the "desired operation" mentioned above, consider applying a filter to the input image data (Di1 to Di100) shown in FIG. 17. Suppose the filter coefficients are those shown in FIG. 18. That is, consider computing each output DOUTj as a sum of the input data around Dij weighted by the coefficients COEa to COEe (the defining equation is not reproduced in this text), and outputting DOUT1 to DOUT100 as the output image data. Here, COEa to COEe are constants.

[0025] The input image data (Di1 to Di100), supplied word (pixel) serially, are parallelized by the SP 101 and supplied to PE 102(1) to PE 102(100), respectively.

[0026] The filter operation is performed during the following horizontal period (time T3 in FIG. 16). That is, each PE 102(j) (j = 1 to 100) stores the data Dij received at time T2 in its memory 121 and, within time T3, reads the data from the memory 121, performs the filter operation in the arithmetic circuit 122, and stores the result (DOUTj) back in the memory 121. This operation is described concretely below.

[0027] First, each PE 102(j) passes the data Dij received from the SP 101 to its left neighbor. Each PE 102(j-1) stores this data Dij in its memory 121. Next, each PE 102(j-1) passes the data Dij it has just stored in its memory 121 one step further to the left. Each PE 102(j-2) stores this data Dij in its memory 121.

[0028] Next, each PE 102(j) passes the data Dij received from the SP 101 to its right neighbor. Each PE 102(j+1) stores this data Dij in its memory 121. Each PE 102(j+1) then passes the data Dij it has just stored in its memory 121 one step further to the right. Each PE 102(j+2) stores this data Dij in its memory 121.

[0029] As a result of this series of operations, the memory 121 of each PE 102(j) holds the data Dij received from the SP 101 together with the data Dij-1, Dij-2, Dij+1, and Dij+2 received from the neighboring PE 102(j-1), PE 102(j-2), PE 102(j+1), and PE 102(j+2).

[0030] Each PE 102(j) sequentially reads the data Dij-1, Dij-2, Dij+1, and Dij+2 from the memory 121 and supplies them to the arithmetic circuit 122 in the PE 102(j). The arithmetic circuit 122 multiplies these data in turn by COEa, COEb, COEc, COEd, and COEe and accumulates the products. The accumulated result finally obtained, DOUTj, is stored in the memory 121 again.
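The neighbor transfers and the multiply-accumulate step of paragraphs [0027] to [0030] can be sketched in Python as below. Several points are assumptions, because FIG. 18 and the defining equation are not available in this text: the coefficients COEa to COEe are taken to weight Di(j-2), Di(j-1), Dij, Di(j+1), Di(j+2) in that order, PEs at the ends of the line are assumed to receive 0 from outside, and all function names are hypothetical.

```python
def shift_left(values, fill=0):
    """Every PE passes its value to its left neighbor; PE j then holds the value of PE j+1."""
    return values[1:] + [fill]


def shift_right(values, fill=0):
    """Every PE passes its value to its right neighbor; PE j then holds the value of PE j-1."""
    return [fill] + values[:-1]


def five_tap_filter(di, coe):
    """Return DOUT for all PEs; coe = (COEa, COEb, COEc, COEd, COEe), order assumed."""
    plus1 = shift_left(di)        # Di(j+1) arrives in PE j after one left transfer
    plus2 = shift_left(plus1)     # Di(j+2) after a second left transfer
    minus1 = shift_right(di)      # Di(j-1) after one right transfer
    minus2 = shift_right(minus1)  # Di(j-2) after a second right transfer
    a, b, c, d, e = coe
    # Every PE performs the same multiply-accumulate on its own five stored values.
    return [a * m2 + b * m1 + c * x + d * p1 + e * p2
            for m2, m1, x, p1, p2 in zip(minus2, minus1, di, plus1, plus2)]
```

Four transfer steps (two to the left, two to the right) are enough here because the taps reach at most two positions from the center, which is why this filter fits comfortably within one horizontal period.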

[0031] This completes the "desired operation".

[0032] Then, within the following horizontal blanking period, the data DOUTj are serialized by the PS 104 and output in order, starting from DOUT1, via the serial data output terminal (SOUT). That is, the filter result data for one horizontal period (1H) are output word (pixel) serially from the serial data output terminal (SOUT).

【００３３】[0033]

[Problem to Be Solved by the Invention] Next, consider the case where the filter coefficients are as shown in FIG. 19. The nonzero coefficients form groups, and between those groups there are runs of coefficients whose value is 0. A ghost canceler is one example that uses such filter coefficients. That is, to remove ghosts from terrestrial TV broadcasts, multiplication with such a "filter whose nonzero coefficients form groups" is sometimes performed.

[0034] When the filter of FIG. 19 is applied, the operation in each PE 102(j) takes too long, as can be inferred from the 5-tap filter example of FIG. 18. The operation therefore cannot be finished within one horizontal period (time T3 in FIG. 16), and the computation becomes impossible.

[0035] That is, the PE 102(j) must compute the filter output for the coefficients of FIG. 19 (the expression is not reproduced in this text). However, Dij-15, for example, is stored in the PE 102(j-15) at time T2 in FIG. 16, and for it to be used in the PE 102(j) it must be passed to the right one neighbor at a time, through PE 102(j-14), PE 102(j-13), ..., PE 102(j-1). Passing data between processor elements in this way takes too long, and so the computation as a whole takes too long.
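The cost of these transfers can be estimated with a small Python sketch. It rests on assumptions that go beyond the text: every transfer step moves data by exactly one PE, all PEs shift simultaneously, and values passing through are stored on the way so that one chain of shifts serves every tap on that side. The function name and the example offsets for the grouped filter are hypothetical, since the coefficients of FIG. 19 are not available here; only the offset of 15 comes from the text.

```python
def neighbor_transfer_steps(tap_offsets):
    """Count adjacent-PE transfer steps needed before every PE holds all of its taps.

    tap_offsets lists k for each nonzero tap that uses Di(j+k).
    Zero-valued coefficients between groups do not shorten the chain.
    """
    farthest_left = max((-k for k in tap_offsets if k < 0), default=0)
    farthest_right = max((k for k in tap_offsets if k > 0), default=0)
    return farthest_left + farthest_right


five_tap = neighbor_transfer_steps([-2, -1, 0, 1, 2])       # filter of FIG. 18
grouped = neighbor_transfer_steps([-16, -15, -14, -1, 0, 1])  # hypothetical grouped filter
```

In this model the number of transfer steps is set by the distance to the farthest tap, not by the number of nonzero coefficients, which is why a sparse but widely spread filter is expensive on the conventional arrangement.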

[0036] To shorten this transfer time, the processor elements could be wired so that data can be exchanged in a single step with the processor element 15 positions away. This, however, would increase the amount of wiring and make the circuit too large, and is therefore impractical.

[0037] In short, when multiplication with a "filter whose nonzero coefficients form groups", such as that used in a ghost canceler, is performed on a parallel processor of the conventional configuration, the data transfers take too long and the computation cannot be finished within one horizontal period (time T3 in FIG. 16).

[0038] The present invention has been made in view of the above circumstances, and its object is to provide a parallel processor that can efficiently perform multiplication with a filter whose nonzero coefficients form groups, such as that used in a ghost canceler.

【００３９】[0039]

[Means for Solving the Problem] The parallel processor according to claim 1 comprises: a plurality of serial/parallel conversion means for converting a plurality of serially input first data into parallel form; a plurality of arithmetic means for receiving in parallel, and processing, the parallelized first data from the plurality of serial/parallel conversion means; and parallel/serial conversion means for receiving in parallel the second data generated by the processing in the arithmetic means, converting the second data into serial form, and outputting them serially.

[0040] The plurality of serial/parallel conversion means may receive the first data at the same timing while performing their parallel conversions at timings shifted relative to one another.
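The effect of shifting the conversion timing can be sketched in Python as follows. The sketch assumes a converter of the FIG. 11 type whose write pointer is started a given number of word clocks before the first valid word, so that the leading registers latch dummy data. The 15-clock lead is the value the embodiment gives for SP1; the function name, the `dummy` placeholder, and the idea of treating the lead as a parameter are assumptions made for illustration.

```python
def offset_serial_to_parallel(serial_words, lead, m=100, dummy=None):
    """Serial/parallel conversion with the write pointer started `lead` clocks early.

    Register k (0-based) latches whatever word is on SIN at the k-th clock
    after the pointer starts, so the first `lead` registers hold dummy data.
    """
    stream = [dummy] * lead + list(serial_words)
    registers = stream[:m]
    # Pad with dummy data if the input ended before every register was written.
    return registers + [dummy] * (m - len(registers))


line = [f"Di{j}" for j in range(1, 101)]
sp1 = offset_serial_to_parallel(line, lead=15)  # register 111(16) receives Di1
sp_ref = offset_serial_to_parallel(line, lead=0)  # unshifted converter for comparison
```

With a lead of 15, the register feeding PE j holds Di(j-15), so that word reaches the PE directly during the parallel transfer instead of through 15 neighbor-to-neighbor steps.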

[0041] The first data may be a video signal, and the operation performed by the arithmetic means may be filter processing for a ghost canceler.

[0042] The parallel processor according to claim 4 is a parallel processor comprising serial/parallel conversion means for converting a plurality of serially input first data into parallel form, a plurality of arithmetic means for receiving in parallel, and processing, the parallelized first data from the serial/parallel conversion means, and parallel/serial conversion means for receiving in parallel the second data generated by the processing in the arithmetic means, converting the second data into serial form, and outputting them serially, characterized in that there are a plurality of serial/parallel converters, and at least one of the plurality of serial/parallel converters has registers connected in series, receives the first data at the first-stage register, and takes the parallelized first data from the outputs of the respective registers.

[0043] The parallel processor according to claim 5 comprises: data delay means for delaying a plurality of serially input first data; serial/parallel conversion means for converting the delayed first data from the delay means into parallel form; a plurality of arithmetic means for receiving in parallel, and processing, the first data parallelized by the serial/parallel conversion means; and parallel/serial conversion means for receiving in parallel the second data generated by the processing in the arithmetic means, converting the second data into serial form, and outputting them serially.

[0044] The delay means may delay the first data by approximately the length of time during which those of the first data that are not needed for the processing in the arithmetic means are being input.

【００４５】[0045]

[Operation] In the parallel processor of claim 1, the plurality of serial/parallel conversion means perform serial/parallel conversion at shifted timings and the plurality of arithmetic means perform the processing. This makes it possible to efficiently perform multiplication with a filter whose nonzero coefficients form groups, such as that used in a ghost canceler.

[0046] In the parallel processor of claim 4, the serial/parallel conversion means in which a plurality of registers are connected in series receives the first data at the first-stage register and takes the parallelized first data from the outputs of the respective registers, thereby shifting the timing of the serial/parallel conversion, and the arithmetic means perform the processing. This makes it possible to efficiently perform multiplication with a filter whose nonzero coefficients form groups, such as that used in a ghost canceler.

[0047] In the parallel processor of claim 5, the data delay means delays data that arrive sufficiently far apart in time before they are serial/parallel converted, and the plurality of arithmetic means perform the processing. This makes it possible to efficiently perform multiplication with a filter whose nonzero coefficients form groups, such as that used in a ghost canceler.

【００４８】[0048]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

[0049] FIGS. 1 to 3 relate to a first embodiment of the present invention. FIG. 1 is a block diagram showing the configuration of the parallel processor, FIG. 2 is a block diagram showing the configuration of a processor element of FIG. 1, and FIG. 3 is a timing chart explaining the operation of the parallel processor of FIG. 1.

[0050] Conventionally there was a single serial/parallel converter, as shown in FIG. 10, but this embodiment provides a plurality of serial/parallel converters. Specifically, as shown in FIG. 1 for example, there are three serial/parallel converters (SP) 1, 2, and 3. Each of SP1, SP2, and SP3 has the same configuration as the conventional SP 101 shown in FIG. 11.

[0051] In FIG. 1, reference numerals 4(1) to 4(100) denote processor elements (PE), whose configuration is almost the same as that of the conventional PE 102(1) to PE 102(100) shown in FIG. 13. However, whereas the conventional PE 102(j) has a single input terminal (PEIN) from the serial/parallel converter, as shown in FIG. 13, each PE 4(j) (j = 1 to 100) of this embodiment has three input terminals (PEIN1, PEIN2, PEIN3), as shown in FIG. 2, so that it can receive data from SP1, SP2, and SP3. The data from each input terminal (PEIN1, PEIN2, PEIN3) can be stored in the memory 121 as in the conventional example.

[0052] As in the conventional case, there is only one control circuit 103 for controlling PE 4(1) to PE 4(100), and it is common to all 100 processor elements. That is, FIG. 1 shows a SIMD (Single Instruction Multiple Data) parallel processor having the same number of processor elements as the number of data (100) in one horizontal period.

【００５３】なお、その他の部分についての説明は、従
来例で述べたものと同様であり、同一記号を付し、その
説明を省略する。The description of the other parts is the same as that described in the conventional example, the same symbols are given, and the description thereof is omitted.

【００５４】さて、本実施例である図１の構成で、図１
９に示すフィルタをかける場合の動作説明を、図３を用
いて以下に詳しく行う。In the configuration of FIG. 1 which is the present embodiment,
The operation of applying the filter shown in FIG. 9 will be described in detail below with reference to FIG.

【００５５】映像信号はＳＰ１のシリアルデータ入力端
子（ＳＩＮ）、ＳＰ２のシリアルデータ入力端子（ＳＩ
Ｎ）、ＳＰ３のシリアルデータ入力端子（ＳＩＮ）に共
通に、そして同時にワード（画素）シリアルに供給する
（図３の時刻Ｔ１１）。The video signal is a serial data input terminal (SIN) of SP1 and a serial data input terminal (SI of SP2).
N), common to the serial data input terminal (SIN) of SP3, and at the same time, word (pixel) serial data is supplied (time T11 in FIG. 3).

【００５６】Then, 15 cycles before the first data of one horizontal period shown in FIG. 3(a) is input, an ON signal is input from the write pointer input terminal (WPTR) of SP1, as shown in FIG. 3(b). Referring to FIG. 11, when the first input data (Di1) arrives at SP1, the ON signal has been passed to the delay element 112(16), so the switch 113(16) is turned on and Di1 is stored in the register 111(16). For the next input data (Di2), the ON signal has been passed to the delay element 112(17) and the switch 113(17) is turned on, so Di2 is stored in the register 111(17). Thereafter, the input data (Di3 to Di85) are stored in the registers 111(18) to 111(100) in the same manner. That is, the 15 dummy data preceding Di1 and the data Di1 to Di85 are stored in the registers 111(1) to 111(100) of SP1, respectively.
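The register contents described in this paragraph can be reproduced with a small Python sketch. It rests on a simplified model that the text does not state explicitly: the switch 113(k) closes k-1 cycles after the ON signal is input, cycle 0 is the cycle in which Di1 arrives, and the samples before Di1 are labeled `"dummy"`. The function name `simulate_write_pointer` is hypothetical.

```python
def simulate_write_pointer(samples, on_cycle, m=100):
    """Simplified model of a write-pointer serial/parallel converter.

    samples  : iterable of (cycle, value) pairs seen on SIN
    on_cycle : cycle at which the ON signal enters WPTR (assumption:
               switch 113(1) closes in that same cycle)
    Returns the contents of registers 111(1)..111(m) as a list.
    """
    regs = [None] * m
    for cycle, value in samples:
        pos = cycle - on_cycle  # 0-based index of the switch that is closed
        if 0 <= pos < m:
            regs[pos] = value   # register 111(pos + 1) latches the sample
    return regs


# 15 dummy samples arrive before Di1; Di1 arrives at cycle 0.
samples = [(c, "dummy") for c in range(-15, 0)]
samples += [(c, f"Di{c + 1}") for c in range(100)]

# SP1: the ON signal is input 15 cycles before Di1.
sp1 = simulate_write_pointer(samples, on_cycle=-15)
```

In this model `sp1[15]` (register 111(16)) holds `Di1` and `sp1[99]` (register 111(100)) holds `Di85`, matching the paragraph above.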

【００５７】After this storing (time T12 in FIG. 3), as shown in FIG. 3(e), the data in the registers 111(1) to 111(100) of SP1 are stored in the memories 121 of PE4(1) to PE4(100) via PEIN1 of PE4(1) to PE4(100), respectively (see FIG. 2).

【００５８】Immediately before the first data of one horizontal period in FIG. 3(a) is input, an ON signal is input from the write pointer input terminal (WPTR) of SP2, as shown in FIG. 3(c). Referring to FIG. 11, when the first input data (Di1) arrives at SP2, the ON signal has been passed to the delay element 112(1) and the switch 113(1) is turned on, so Di1 is stored in the register 111(1). For the next input data (Di2), the ON signal has been passed to the delay element 112(2) and the switch 113(2) is turned on, so Di2 is stored in the register 111(2). Thereafter, the input data (Di3 to Di100) are stored in the registers 111(3) to 111(100) in the same manner. That is, the data Di1 to Di100 for one horizontal period (1H) are stored in the registers 111(1) to 111(100) of SP2, respectively.

【００５９】After this storing (time T13 in FIG. 3), as shown in FIG. 3(e), the data in the registers 111(1) to 111(100) of SP2 are stored in the memories 121 of PE4(1) to PE4(100) via PEIN2 of PE4(1) to PE4(100), respectively (see FIG. 2).

【００６０】Further, 15 cycles after the first data of one horizontal period in FIG. 3(a) is input, an ON signal is input from the write pointer input terminal (WPTR) of SP3, as shown in FIG. 3(d). Referring to FIG. 11, when the 16th input data (Di16) arrives at SP3, the ON signal has been passed to the delay element 112(1) and the switch 113(1) is turned on, so Di16 is stored in the register 111(1). For the next input data (Di17), the ON signal has been passed to the delay element 112(2) and the switch 113(2) is turned on, so Di17 is stored in the register 111(2). Thereafter, the input data (Di18 to Di100) are stored in the registers 111(3) to 111(85) in the same manner. The 15 dummy data input after Di100 are then stored in the registers 111(86) to 111(100). That is, Di16 to Di100 and the 15 dummy data are stored in the registers 111(1) to 111(100) of SP3, respectively.

【００６１】After this storing (time T14 in FIG. 3), as shown in FIG. 3(e), the data in the registers 111(1) to 111(100) of SP3 are stored in the memories 121 of PE4(1) to PE4(100) via PEIN3 of PE4(1) to PE4(100), respectively.

【００６２】After these data have been stored, the desired operation is performed (described later). That is, in FIG. 2, each PE4(j) (j = 1 to 100) reads from the memory 121 the data Dij-15, Dij, and Dij+15 received at times T12, T13, and T14, performs the operation in the arithmetic circuit 122, and stores the operation result (DOUTi) in the memory 121 again. This control is performed by the control circuit 103.
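As a rough illustration of what each processor element receives from the three converters, the following Python sketch lists, for every PE4(j), the triple delivered at times T12, T13, and T14. It is a sketch under assumptions: the helper name `pe_inputs` is hypothetical, indices are 0-based (list index `j - 1` corresponds to PE4(j)), and dummy data are represented by `None`.

```python
def pe_inputs(line, shift=15, dummy=None):
    """Return, for each PE, the data received from SP1, SP2 and SP3.

    line[k] is the (k+1)-th pixel of the horizontal period (line[0] is Di1).
    SP1 is started `shift` cycles early, SP2 on time, SP3 `shift` cycles late,
    so the PE at index j receives line[j - shift], line[j], line[j + shift].
    """
    m = len(line)

    def pick(idx):
        # Positions outside the horizontal period hold dummy data.
        return line[idx] if 0 <= idx < m else dummy

    return [(pick(j - shift), pick(j), pick(j + shift)) for j in range(m)]


line = [f"Di{n}" for n in range(1, 101)]
inputs = pe_inputs(line)
print(inputs[15])  # the triple held by PE4(16)
```

Under this model PE4(16) holds Di1, Di16, and Di31, while the elements near either end of the array receive a dummy value in place of the missing pixel.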

【００６３】After the operation, as in the conventional example and referring to FIG. 14, DOUT1 to DOUT100 are supplied to the registers 131(1) to 131(100) in the PS104 via the parallel data input terminals (PIN1 to PIN100) of the PS104. The operation result data (DOUT1 to DOUT100) for one horizontal period (1H) are then serialized and output word (pixel) serially from the serial data output terminal (SOUT).

【００６４】The "desired operation" mentioned above is now described in detail.

【００６５】First, after time T12, each PE4(j) passes the data Dij-15 received from SP1 at time T12 to its left neighbor. Each PE4(j-1) stores this data Dij-15 in its memory 121.

【００６６】Next, each PE4(j) passes the data Dij-15 received from SP1 at time T12 to its right neighbor. Each PE4(j+1) stores this data Dij-15 in its memory 121.

【００６７】Through this series of operations, the memory 121 of each PE4(j) holds the data Dij-15 received from SP1 and the data Dij-16 and Dij-14 received from the adjacent processor elements.

【００６８】Next, after time T13, each PE4(j) passes the data Dij received from SP2 at time T13 to its left neighbor. Each PE4(j-1) stores this data Dij in its memory 121. Each PE4(j-1) then passes the data Dij it has just stored further to its left neighbor. Each PE4(j-2) stores this data Dij in its memory 121.

【００６９】Next, each PE4(j) passes the data Dij received from SP2 at time T13 to its right neighbor. Each PE4(j+1) stores this data Dij in its memory 121. Each PE4(j+1) then passes the data Dij it has just stored further to its right neighbor. Each PE4(j+2) stores this data Dij in its memory 121.

【００７０】Through this series of operations, the memory 121 of each PE4(j) holds the data Dij received from SP2 and the data Dij-1, Dij-2, Dij+1, and Dij+2 received from the neighboring processor elements.

【００７１】Next, after time T14, each PE4(j) passes the data Dij+15 received from SP3 at time T14 to its left neighbor. Each PE4(j-1) stores this data Dij+15 in its memory 121.

【００７２】Next, each PE4(j) passes the data Dij+15 received from SP3 at time T14 to its right neighbor. Each PE4(j+1) stores this data Dij+15 in its memory 121.

【００７３】Through this series of operations, the memory 121 of each PE4(j) holds the data Dij+15 received from SP3 and the data Dij+14 and Dij+16 received from the adjacent processor elements.

【００７４】Through the above operations, the memory 121 of each PE4(j) holds Dij-16, Dij-15, Dij-14, Dij-2, Dij-1, Dij, Dij+1, Dij+2, Dij+14, Dij+15, and Dij+16.

【００７５】Each PE4(j) sequentially reads Dij-16, Dij-15, Dij-14, Dij-2, Dij-1, Dij, Dij+1, Dij+2, Dij+14, Dij+15, and Dij+16 from the memory 121 and supplies them to the arithmetic circuit 122 in PE4(j). The arithmetic circuit 122 multiplies these data, in order, by COEa0, COEb0, COEc0, COEa1, COEb1, COEc1, COEd1, COEe1, COEa2, COEb2, and COEc2, and accumulates the products. The finally obtained cumulative result is stored in the memory 121 again.
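The neighbor exchanges and the multiply-accumulate step described above can be sketched in Python as follows. This is an illustrative model, not the circuit itself: all function names are hypothetical, dummy data and the inputs of the edge elements (which have no neighbor on one side) are taken as 0, and the eleven coefficients are passed in the order COEa0, COEb0, COEc0, COEa1, ..., COEc2.

```python
def column(x, offset):
    """Data a converter delivers to each PE: x[j + offset], or 0 for a dummy."""
    m = len(x)
    return [x[j + offset] if 0 <= j + offset < m else 0 for j in range(m)]


def from_right(col):
    """Every PE passes its value to its left neighbor, so the PE at index j
    ends up holding the value of index j + 1 (0 assumed at the array edge)."""
    return col[1:] + [0]


def from_left(col):
    """Every PE passes its value to its right neighbor, so the PE at index j
    ends up holding the value of index j - 1 (0 assumed at the array edge)."""
    return [0] + col[:-1]


def filter_11tap(x, coefs):
    """11-tap filter with taps at -16, -15, -14, -2..+2, +14, +15, +16."""
    c1, c2, c3 = column(x, -15), column(x, 0), column(x, 15)  # SP1, SP2, SP3
    taps = [
        from_left(c1), c1, from_right(c1),             # Dij-16, Dij-15, Dij-14
        from_left(from_left(c2)), from_left(c2), c2,   # Dij-2, Dij-1, Dij
        from_right(c2), from_right(from_right(c2)),    # Dij+1, Dij+2
        from_left(c3), c3, from_right(c3),             # Dij+14, Dij+15, Dij+16
    ]
    # Multiply each tap by its coefficient and accumulate, per PE.
    return [sum(c * t[j] for c, t in zip(coefs, taps)) for j in range(len(x))]


x = list(range(1, 101))
coefs = list(range(1, 12))
out = filter_11tap(x, coefs)
```

Only eight neighbor transfers are modeled here (two for SP1, four for SP2, two for SP3). For the interior elements the result equals a direct evaluation of the 11-tap sum; the outermost elements differ only because of the edge assumption stated above.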

【００７６】This concludes the description of the "desired operation".

【００７７】Conventionally, even when the operation uses data that are input sufficiently far apart in time ("Dij-16, Dij-15, Dij-14", "Dij-2, Dij-1, Dij, Dij+1, Dij+2", and "Dij+14, Dij+15, Dij+16"), the necessary data had to be brought to each processor element by passing data between adjacent processor elements.

【００７８】In the present invention, however, for data that are input sufficiently far apart in time, the timings of the plurality of serial/parallel converters are shifted relative to one another. The necessary data can thus be brought to each processor element, or to a processor element near it, without passing data between adjacent processor elements, so the data transfer time can be shortened. Accordingly, the computation can be completed within one horizontal period.

【００７９】In other words, this embodiment allows serial/parallel conversion to be performed at different timings. For example, when applying the filter shown in FIG. 19, shifting the timings of the three serial/parallel converters SP1, SP2, and SP3 by 15 cycles each makes it possible to supply Dij-15 from SP1, Dij from SP2, and Dij+15 from SP3 to PE4(j), so that time is no longer spent passing data between processor elements as in the conventional example.
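To give a feel for the saving, the following Python sketch counts neighbor-transfer steps for the FIG. 19 tap pattern. It is only an estimate under assumptions not stated in this section: one step moves data by exactly one processor element, each tap is fetched from the nearest directly supplied position, and a single-converter design supplies only Dij. The function name `neighbor_steps` is hypothetical.

```python
OFFSETS = [-16, -15, -14, -2, -1, 0, 1, 2, 14, 15, 16]


def neighbor_steps(offsets, centers):
    """Count one-element transfer steps needed to gather all taps.

    centers are the tap offsets delivered directly by a converter.
    For each center, data must travel as far as the most distant tap
    on either side that is assigned to that center.
    """
    steps = 0
    for c in centers:
        # Distances of the taps whose nearest center is c.
        group = [o - c for o in offsets
                 if min(centers, key=lambda k: abs(o - k)) == c]
        reach_low = max((-d for d in group if d < 0), default=0)
        reach_high = max((d for d in group if d > 0), default=0)
        steps += reach_low + reach_high
    return steps


three_converters = neighbor_steps(OFFSETS, [-15, 0, 15])
single_converter = neighbor_steps(OFFSETS, [0])
print(three_converters, single_converter)
```

With three converters offset by 15 cycles the count is 2 + 4 + 2 = 8 steps, in line with the transfers described for times T12, T13, and T14; under the stated assumptions a single converter would need 16 + 16 = 32.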

【００８０】Since the same common signal is applied to the serial data input terminal (SIN) of SP1, the serial data input terminal (SIN) of SP2, and the serial data input terminal (SIN) of SP3, the three serial data input terminals (SIN) may be combined into a single input terminal. In that case, the bus line from the serial data input terminal (SIN) of SP1 to the switches 113(1), 113(2), ..., 113(100) of SP1, the bus line from the serial data input terminal (SIN) of SP2 to the switches 113(1), 113(2), ..., 113(100) of SP2, and the bus line from the serial data input terminal (SIN) of SP3 to the switches 113(1), 113(2), ..., 113(100) of SP3 become a common bus line.

【００８１】Next, a second embodiment is described. FIGS. 4 to 7 relate to the second embodiment of the present invention: FIG. 4 is a block diagram showing the configuration of a parallel processor, FIG. 5 is a block diagram showing the configuration of the SPR of FIG. 4, FIG. 6 is a timing diagram explaining the operation of the SPR of FIG. 5, and FIG. 7 is a block diagram showing the configuration of the processor element of FIG. 4. Since the second embodiment is almost the same as the first embodiment, only the differences are described; identical components are given the same reference numerals and their description is omitted.

【００８２】Conventionally, as shown in FIG. 11, there was a single parallel-type serial/parallel converter. In this embodiment, a plurality of serial/parallel converters, of the serial type and the parallel type, are provided.

【００８３】Specifically, for example, one parallel-type serial/parallel converter (SP) 1 and one serial-type serial/parallel converter (SPR) 11 are provided, as shown in FIG. 4.

【００８４】Each PE12(j) receives data from the serial-type serial/parallel converter at different timings (times T42, T43, T44, T45, T46, T47, and T48 in FIG. 6, described later) and can thereby receive arbitrary data. For example, when applying the filter shown in FIG. 19, receiving data from SPR11 at the timing of time T42 supplies Dij-15 from SPR11 to PE12(j), so that time is no longer spent passing data between processor elements as in the conventional example.

【００８５】The serial-type serial/parallel converter is now described.

【００８６】As shown in FIG. 5, the serial-type serial/parallel converter SPR11 has registers 21(1) to 21(100). The registers 21(1) to 21(100) are connected in series, and data from the serial data input terminal (SINR) are passed successively through register 21(100), register 21(99), ..., register 21(2), and register 21(1). The output of each register 21(j) (j = 1 to 100) is output from the parallel data output terminal (POUTRj). As shown in FIG. 6(a), the data Di1, Di2, Di3, ..., Di99, Di100 input serially from SINR during the period of time T41 shift through the registers 21(100) to 21(1).

【００８７】That is, at each time, the data shown in FIG. 6(b) are output from POUTRj (j = 1 to 100). In particular, at time T42, the "15 dummy data preceding Di1" and the "input data Di1 to Di85" are output from POUTRj (j = 1 to 100), respectively. At time T43, the "2 dummy data preceding Di1" and the "input data Di1 to Di98" are output from POUTRj (j = 1 to 100), respectively.

【００８８】At time T44, the "1 dummy data preceding Di1" and the "input data Di1 to Di99" are output from POUTRj (j = 1 to 100), respectively. At time T45, the "input data Di1 to Di100" are output from POUTRj (j = 1 to 100), respectively. At time T46, the "input data Di2 to Di100" and the "1 dummy data following Di100" are output from POUTRj (j = 1 to 100), respectively. At time T47, the "input data Di3 to Di100" and the "2 dummy data following Di100" are output from POUTRj (j = 1 to 100), respectively. At time T48, the "input data Di16 to Di100" and the "15 dummy data following Di100" are output from POUTRj (j = 1 to 100), respectively.
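The behavior of the serial-type converter can be modeled as a plain shift register. The Python sketch below is an illustration under assumptions: the function name `spr_snapshots` is hypothetical, dummy data are represented by `None`, and the cycle numbers used for T42, T45, and T48 (85, 100, and 115 samples after Di1 starts entering) are inferred from the register contents listed above rather than stated in the text.

```python
from collections import deque


def spr_snapshots(line, m=100, extra=15, dummy=None):
    """Shift `line` through an m-stage shift register and record every state.

    Index 0 of each snapshot is register 21(1) (output POUTR1) and index
    m - 1 is register 21(m), which is the stage fed directly by SINR.
    """
    regs = deque([dummy] * m, maxlen=m)
    snaps = {}
    # `extra` dummy samples follow the horizontal period.
    for cycle, sample in enumerate(list(line) + [dummy] * extra, start=1):
        regs.append(sample)  # new sample enters 21(m); the oldest one drops out
        snaps[cycle] = list(regs)
    return snaps


line = [f"Di{n}" for n in range(1, 101)]
snaps = spr_snapshots(line)
t42, t45, t48 = snaps[85], snaps[100], snaps[115]
print(t42[15], t45[0], t48[0])
```

In this model the snapshot taken for T42 holds 15 dummies followed by Di1 to Di85, the one for T45 holds Di1 to Di100, and the one for T48 holds Di16 to Di100 followed by 15 dummies.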

【００８９】Returning to FIG. 4, the parallel processor of this embodiment is described in more detail. The processor elements (PE) 12(j) (j = 1 to 100) shown in FIG. 4 are almost the same as PE102(1) to PE102(100) of the conventional example. However, whereas each conventional processor element has a single input terminal (PEIN) from the serial/parallel converter as shown in FIG. 13, each PE12(j) (j = 1 to 100) of this embodiment shown in FIG. 4 has two input terminals (PEIN1, PEIN2), as shown in FIG. 7, so that it can receive data from SP1 and SPR11. Data from each of the input terminals (PEIN1, PEIN2) can be stored in the memory 121. As in the conventional example, there is only one control circuit 103 for controlling the processor elements PE12(1) to PE12(100), and it is common to all 100 processor elements. That is, FIG. 4 shows a SIMD (Single Instruction Multiple Data) parallel processor having the same number of processor elements as the number of data items (100) in one horizontal period.

【００９０】The other parts are the same as in the first embodiment; they are given the same reference symbols and their description is omitted.

【００９１】The operation of the configuration of FIG. 4, an embodiment of the present invention, when applying the filter shown in FIG. 18 is described in detail below.

【００９２】The video signal is supplied word (pixel) serially from SINR of SPR11 (time T41 in FIG. 6).

【００９３】At time T43 in FIG. 6, the data stored in the registers 21(j) are output from POUTRj (j = 1 to 100) and stored in the memory 121 of each PE12(j) via PEIN2 of that PE12(j). As a result, the memories 121 of PE12(1) to PE12(100) hold the "2 dummy data preceding Di1" and the "input data Di1 to Di98", respectively.

【００９４】At time T44 in FIG. 6, the data stored in the registers 21(j) are output from POUTRj (j = 1 to 100) and stored in the memory 121 of each PE12(j) via PEIN2 of that PE12(j). As a result, the memories 121 of PE12(1) to PE12(100) hold the "1 dummy data preceding Di1" and the "input data Di1 to Di99", respectively.

【００９５】At time T45 in FIG. 6, the data stored in the registers 21(j) are output from POUTRj (j = 1 to 100) and stored in the memory 121 of each PE12(j) via PEIN2 of that PE12(j). As a result, the memories 121 of PE12(1) to PE12(100) hold the "input data Di1 to Di100", respectively.

【００９６】At time T46 in FIG. 6, the data stored in the registers 21(j) are output from POUTRj (j = 1 to 100) and stored in the memory 121 of each PE12(j) via PEIN2 of that PE12(j). As a result, the memories 121 of PE12(1) to PE12(100) hold the "input data Di2 to Di100" and the "1 dummy data following Di100", respectively.

【００９７】At time T47 in FIG. 6, the data stored in the registers 21(j) are output from POUTRj (j = 1 to 100) and stored in the memory 121 of each PE12(j) via PEIN2 of that PE12(j). As a result, the memories 121 of PE12(1) to PE12(100) hold the "input data Di3 to Di100" and the "2 dummy data following Di100", respectively.

【００９８】After these data have been stored in the memory 121, the desired operation is performed. That is, each PE12(j) (j = 1 to 100) sequentially reads from the memory 121 the data Dij-2, Dij-1, Dij, Dij+1, and Dij+2 received at times T43, T44, T45, T46, and T47 and supplies them to the arithmetic circuit (ALU) 122 in PE12(j), which multiplies these data, in order, by COEa, COEb, COEc, COEd, and COEe and accumulates the products. The finally obtained cumulative result is stored in the memory 121 again. This control is performed by the control circuit 103.
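The five-tap case can be sketched in Python as follows: five snapshots of the shift register are captured into each element's memory, and the multiply-accumulate then runs with no exchange between elements. This is a sketch under assumptions: the function name `five_tap_via_spr` is hypothetical, dummy data are taken as 0, and the capture cycles are expressed relative to the cycle in which the register holds Di1 to Di100.

```python
from collections import deque


def five_tap_via_spr(x, coefs):
    """5-tap filter fed directly from a serial-type converter model.

    x     : pixel values of one horizontal period
    coefs : [COEa, COEb, COEc, COEd, COEe], applied to Dij-2 .. Dij+2
    """
    m = len(x)
    regs = deque([0] * m, maxlen=m)           # registers 21(1)..21(m)
    # Capture 2 cycles before to 2 cycles after the register is exactly full.
    capture_at = {m + off: k for k, off in enumerate(range(-2, 3))}
    mem = [[0] * 5 for _ in range(m)]         # memory 121 of each PE
    for cycle, sample in enumerate(list(x) + [0, 0], start=1):
        regs.append(sample)                   # shift by one stage
        k = capture_at.get(cycle)
        if k is not None:
            for j in range(m):
                mem[j][k] = regs[j]           # each PE stores its own register output
    # Multiply-accumulate per PE; no neighbor transfers are needed.
    return [sum(c * d for c, d in zip(coefs, mem[j])) for j in range(m)]


x = list(range(1, 101))
coefs = [1, 2, 3, 4, 5]
out = five_tap_via_spr(x, coefs)
```

Because every tap arrives straight from the shift register, the result matches a direct zero-padded evaluation of the five-tap sum for every element, including those at the ends of the line.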

【００９９】After the operation, as in the conventional example, DOUT1 to DOUT100 are supplied to the registers 131(1) to 131(100) in the PS104 via the parallel data input terminals PIN1 to PIN100 of the PS104. The operation result data (DOUT1 to DOUT100) for one horizontal period (1H) are then serialized and output word (pixel) serially from SOUT.

【０１００】In this way, because the video signal (Di1 to Di100) is parallelized using the serial-type serial/parallel converter SPR11, the data output from POUTRj (j = 1 to 100) differ from one time to the next. By supplying data to each PE12(j) at the different times T43, T44, T45, T46, and T47 as described above, each PE12(j) can receive the data Dij-2, Dij-1, Dij, Dij+1, and Dij+2 directly from SPR11.

【０１０１】Accordingly, unlike the conventional example, data need not be passed between processor elements, so the processing time is shortened.

【０１０２】Of course, because the present invention uses a serial-type serial/parallel converter, power consumption is higher than when a conventional parallel-type serial/parallel converter is used. If power consumption is to be given priority, SP1 is used and SPR11 is not used in FIG. 4. That is, the video signal (Di1 to Di100) is supplied from SIN of SP1 in FIG. 4 and, as in the conventional example, Dij is given to the memory 121 of each PE12(j) via POUTj and PEIN1 of that PE12(j). Then, as in the conventional example, the computation proceeds by passing data between processor elements.

【０１０３】In short, in this embodiment, SPR11 is used when computation time is given priority, and SP1 is used when power consumption is given priority. Conventionally there was no choice as to whether computation time or power consumption should be prioritized; in this embodiment, either can be chosen.

【０１０４】Next, the operation of the configuration of FIG. 4 of this embodiment when applying the filter shown in FIG. 19 is described in detail below.

【０１０５】The video signal is supplied word (pixel) serially from SINR of SPR11 (time T41 in FIG. 6).

【０１０６】At time T42 in FIG. 6, the data stored in the registers 21(j) are output from POUTRj (j = 1 to 100) and stored in the memory 121 of each PE12(j) via PEIN2 of that PE12(j). As a result, the memories 121 of PE12(1) to PE12(100) hold the "15 dummy data preceding Di1" and the "input data Di1 to Di85", respectively.

【０１０７】At time T45 in FIG. 6, the data stored in the registers 21(j) are output from POUTRj (j = 1 to 100) and stored in the memory 121 of each PE12(j) via PEIN2 of that PE12(j). As a result, the memories 121 of PE12(1) to PE12(100) hold the "input data Di1 to Di100", respectively.

【０１０８】At time T48 in FIG. 6, the data stored in the registers 21(j) are output from POUTRj (j = 1 to 100) and stored in the memory 121 of each PE12(j) via PEIN2 of that PE12(j). As a result, the memories 121 of PE12(1) to PE12(100) hold the "input data Di16 to Di100" and the "15 dummy data following Di100", respectively.

【０１０９】After these data have been stored, the desired operation is performed (described later). That is, each PE12(j) (j = 1 to 100) reads from the memory 121 the data Dij-15, Dij, and Dij+15 received at times T42, T45, and T48, performs the operation in the arithmetic circuit 122, and stores the operation result (DOUTi) in the memory 121 again. This control is performed by the control circuit 103.

【０１１０】After the operation, as in the conventional example, DOUT1 to DOUT100 are supplied to the registers 131(1) to 131(100) in the PS104 via the parallel data input terminals PIN1 to PIN100 of the PS104. The operation result data (DOUT1 to DOUT100) for one horizontal period (1H) are then serialized and output word (pixel) serially from SOUT.

【０１１１】The "desired operation" mentioned above is now described in detail.

【０１１２】First, after time T42, each PE12(j) passes the data Dij-15 received from SPR11 at time T42 to its left neighbor. Each PE12(j-1) stores this data Dij-15 in its memory 121.

【０１１３】Next, each PE12(j) passes the data Dij-15 received from SPR11 at time T42 to its right neighbor. Each PE12(j+1) stores this data Dij-15 in its memory 121.

【０１１４】Through this series of operations, the memory 121 of each PE12(j) holds the data Dij-15 received from SPR11 and the data Dij-16 and Dij-14 received from the adjacent processor elements.

【０１１５】Next, after time T45, each PE12(j) passes the data Dij received from SPR11 at time T45 to its left neighbor. Each PE12(j-1) stores this data Dij in its memory 121. Each PE12(j-1) then passes the data Dij it has just stored further to its left neighbor. Each PE12(j-2) stores this data Dij in its memory 121.

【０１１６】Next, each PE12(j) passes the data Dij received from SPR11 at time T45 to its right neighbor. Each PE12(j+1) stores this data Dij in its memory 121. Each PE12(j+1) then passes the data Dij it has just stored further to its right neighbor. Each PE12(j+2) stores this data Dij in its memory 121.

【０１１７】Through this series of operations, the memory 121 of each PE12(j) holds the data Dij received from SPR11 and the data Dij-1, Dij-2, Dij+1, and Dij+2 received from the neighboring processor elements.

[0118] Next, after time T48, each PE 12(j) passes the data Dij+15, which it received from the SPR 11 at time T48, to its left neighbor. Each PE 12(j-1) stores this data Dij+15 in its memory 121.

[0119] Next, each PE 12(j) passes the data Dij+15 received from the SPR 11 at time T48 to its right neighbor. Each PE 12(j+1) stores this data Dij+15 in its memory 121.

[0120] As a result of this series of operations, the memory 121 of each PE 12(j) holds the data Dij+15 received from the SPR 11 and the data Dij+14 and Dij+16 received from the adjacent processor elements.

[0121] Through the above operations, the memory 121 of each PE 12(j) holds Dij-16, Dij-15, Dij-14, Dij-2, Dij-1, Dij, Dij+1, Dij+2, Dij+14, Dij+15, and Dij+16.

[0122] Each PE 12(j) reads Dij-16, Dij-15, Dij-14, Dij-2, Dij-1, Dij, Dij+1, Dij+2, Dij+14, Dij+15, and Dij+16 from the memory 121 in sequence and supplies them to the arithmetic circuit 122 in the PE 12(j). The arithmetic circuit 122 multiplies these data, in that order, by COEa0, COEb0, COEc0, COEa1, COEb1, COEc1, COEd1, COEe1, COEa2, COEb2, and COEc2, and accumulates the products. The accumulated result finally obtained is stored again in the memory 121.
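The accumulated result in [0122] can be written out as a single sum. The following is a reconstruction from the pairing order given in the text, assuming that the stored result is the DOUTj later passed to the PS 104; it is offered as a reading aid rather than as the equation of the original publication.

```latex
\begin{aligned}
\mathrm{DOUT}_j ={}& \mathrm{COE}_{a0}\,D_{i,j-16} + \mathrm{COE}_{b0}\,D_{i,j-15} + \mathrm{COE}_{c0}\,D_{i,j-14} \\
&+ \mathrm{COE}_{a1}\,D_{i,j-2} + \mathrm{COE}_{b1}\,D_{i,j-1} + \mathrm{COE}_{c1}\,D_{i,j}
 + \mathrm{COE}_{d1}\,D_{i,j+1} + \mathrm{COE}_{e1}\,D_{i,j+2} \\
&+ \mathrm{COE}_{a2}\,D_{i,j+14} + \mathrm{COE}_{b2}\,D_{i,j+15} + \mathrm{COE}_{c2}\,D_{i,j+16}
\end{aligned}
```

The three rows correspond to the three coefficient groups centered on offsets -15, 0, and +15.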

[0123] This concludes the description of the "desired calculation".

[0124] Because the video signal (Di1 to Di100) is parallelized using the serial-type serial/parallel converter SPR 11, the data output from POUTj (j = 1 to 100) differs from one time to the next. By supplying data to each PE 12(j) at the different times T42, T45, and T48 as described above, each PE 12(j) can receive the data Dij-15, Dij, and Dij+15 directly from the SPR.
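The behavior described in [0124] can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the circuit of FIG. 5: the function name `spr_snapshots`, the tap cycles 85, 100, and 115 (standing in for times T42, T45, and T48), and the wiring in which the newest sample appears at POUT100 are hypothetical choices made so that POUTj carries Dij-15, Dij, and Dij+15 in turn. `None` stands for dummy (indeterminate) data.

```python
from collections import deque


def spr_snapshots(line, taps=(85, 100, 115)):
    """Return {cycle: [POUT1, ..., POUTm]} for a serially connected register chain.

    Assumption: the register nearest SIN drives POUTm, so after t shift
    cycles POUTj carries pixel number t - m + j (1-based).
    """
    m = len(line)
    regs = deque([None] * m, maxlen=m)  # regs[0] -> POUT1, regs[m-1] -> POUTm
    snapshots = {}
    for t in range(1, max(taps) + 1):
        sample = line[t - 1] if t <= m else None  # dummy data after the line ends
        regs.append(sample)                       # oldest value drops off at POUT1
        if t in taps:
            snapshots[t] = list(regs)
    return snapshots


if __name__ == "__main__":
    snaps = spr_snapshots(list(range(1, 101)))  # pixel n has value n
    j = 50
    print([snaps[t][j - 1] for t in (85, 100, 115)])
```

With pixel values equal to their position in the line, the three snapshots for PE(50) hold pixels 35, 50, and 65, that is, Dij-15, Dij, and Dij+15.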

[0125] Therefore, far less data has to be passed between processor elements than in the conventional device, and the processing time is shortened.

[0126] Of course, because this embodiment uses the serial-type serial/parallel converter, its power consumption is higher than when the conventional parallel-type serial/parallel converter is used. If power consumption is to be prioritized, SP 1 in FIG. 4 is used and SPR 11 is not used. That is, the video signal (Di1 to Di100) is supplied from SIN of SP 1 in FIG. 4 and, as in the conventional device, Dij is supplied to the memory 121 of each PE 12(j) via POUTj and PEIN1 of that PE 12(j). Then, as in the conventional device, data is passed between the processor elements and the calculation proceeds.

[0127] In short, in this embodiment SPR 11 is used when calculation time is prioritized, and SP 1 is used when power consumption is prioritized. Conventionally there was no way to choose between prioritizing calculation time and prioritizing power consumption, whereas this embodiment allows that choice.

[0128] Next, a third embodiment is described. FIGS. 8 and 9 relate to the third embodiment of the present invention: FIG. 8 is a block diagram showing the configuration of the parallel processor, and FIG. 9 is a timing diagram illustrating the operation of the parallel processor of FIG. 8. Because the third embodiment is almost the same as the first embodiment, only the differences are described; identical components are given the same reference numerals and their description is omitted.

[0129] For example, because image data is 8 bits wide, the registers 111(j) (j = 1 to 100) of SP 1 shown in FIG. 1 are registers that each store 8 bits of data. The switches 113(j) (j = 1 to 100) are each 8 bits wide, and the bus lines from SIN to the switches 113(1), 113(2), ..., 113(100) are 8 bits wide.

[0130] The serial/parallel converters in the conventional device and in the embodiments above are 8 bits wide, as just described. In this embodiment, the bit width of the serial/parallel converter is extended and delay circuits are provided at its input.

[0131] Specifically, as shown in FIG. 8 for example, the serial/parallel converter (SP) 31 is 24 bits wide and has delay circuits 32 and 33 at its input.

[0132] By using the delay circuits 32 and 33, the same input data can be input to SP 31 at different timings. For example, when applying the filter shown in FIG. 19, the delay circuits 32 and 33 allow three copies of the input data, offset from one another by 15 cycles, to be input to SIN of the 24-bit-wide SP 31.

[0133] Consequently, Dij-15, Dij, and Dij+15 can be supplied to PE 102(j), and time is no longer spent passing data between processor elements as in the conventional device.

[0134] FIG. 8 is now described in more detail. SP 31 has almost the same configuration as the conventional SP shown in FIG. 12. However, whereas the conventional example is 8 bits wide, this embodiment is 24 bits wide. That is, in FIG. 11, the registers 111(j) (j = 1 to 100) are registers that each store 24 bits of data. The switches 113(j) (j = 1 to 100) are each 24 bits wide, and the bus lines from SIN to the switches 113(1), 113(2), ..., 113(100) are 24 bits wide.

[0135] The delay circuits 32 and 33 are circuits that delay 8-bit-wide data. Making the delay amount variable, so that it can be changed by external control, makes the circuit more versatile. For simplicity, however, the delay amount of each circuit is fixed at 15 cycles here.

[0136] The input data (8 bits), supplied word (pixel) serially, is input directly to the lower 8 bits of SIN of SP 31. The input data (8 bits) is also input to the delay circuit 32, and the data delayed by 15 cycles by the delay circuit 32 is input to the middle 8 bits of SIN of the SP. Further, the data (8 bits) from the delay circuit 32 is input to the delay circuit 33, and the data delayed by a further 15 cycles by the delay circuit 33 is input to the upper 8 bits of SIN of the SP.
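The input arrangement of [0136] can be sketched as two cascaded 15-cycle delay lines whose outputs are packed with the undelayed pixel into one 24-bit word. This is a minimal illustration only: the function name `sin_words` is hypothetical, and the delay lines are initialized to zero, whereas the source treats their initial contents as indeterminate.

```python
from collections import deque


def sin_words(pixels, delay=15):
    """Return the 24-bit words presented to SIN of the SP, one per input cycle."""
    d32 = deque([0] * delay)  # models delay circuit 32 (assumed zero-filled)
    d33 = deque([0] * delay)  # models delay circuit 33 (assumed zero-filled)
    words = []
    for p in pixels:
        low = p & 0xFF          # undelayed input -> lower 8 bits
        mid = d32.popleft()     # input delayed by 15 cycles -> middle 8 bits
        d32.append(low)
        top = d33.popleft()     # input delayed by 30 cycles -> upper 8 bits
        d33.append(mid)
        words.append((top << 16) | (mid << 8) | low)
    return words


if __name__ == "__main__":
    words = sin_words(list(range(1, 101)))  # pixel n has value n
    w = words[30]                           # 31st cycle
    print(w >> 16, (w >> 8) & 0xFF, w & 0xFF)
```

On the 31st cycle the upper, middle, and lower bytes carry pixels 1, 16, and 31, which is the 15-cycle stagger the text describes.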

[0137] The remaining parts are the same as those described for the conventional example; they are given the same reference symbols and their description is omitted.

[0138] The operation of the configuration of FIG. 8, which is an embodiment of the present invention, when applying the filter shown in FIG. 19 is described in detail below.

[0139] The data, which is a video signal, is supplied word (pixel) serially as shown in FIG. 9(a). This data is delayed by 15 cycles by the delay circuit 32, as shown in FIG. 9(b).

[0140] This data is delayed by a further 15 cycles by the delay circuit 33, as shown in FIG. 9(c).

[0141] These data (the data shown in FIGS. 9(a) to 9(c)) are sent to the lower 8 bits, the middle 8 bits, and the upper 8 bits of SP 31, respectively.

[0142] Immediately before the data of FIG. 9(b) is sent to the SP, an ON signal is input from WPTR of SP 31, as shown in FIG. 9(d).

[0143] Immediately after the ON signal is input, Di16 is present on the lower bits of SIN. The 16th input data (Di16) is therefore stored in the register 111(1), because the ON signal has reached the delay element 112(1) and the switch 113(1) is on. The next input data (Di17) is stored in the register 111(2), because the ON signal has reached the delay element 112(2) and the switch 113(2) is on. In the same way, the input data Di18 to Di100 are stored in the registers 111(3) to 111(85). Nothing is input to the registers 111(86) to 111(100). That is, Di16 to Di100 followed by 15 dummy data (indeterminate values) are stored in the lower bits of the registers 111(1) to 111(100), respectively (see FIG. 12).

[0144] Likewise, immediately after the ON signal is input, Di1 is present on the middle bits of SIN. The first input data (Di1) is therefore stored in the register 111(1), because the ON signal has reached the delay element 112(1) and the switch 113(1) is on. The next input data (Di2) is stored in the register 111(2), because the ON signal has reached the delay element 112(2) and the switch 113(2) is on. In the same way, the input data Di3 to Di100 are stored in the registers 111(3) to 111(100). That is, Di1 to Di100 are stored in the middle bits of the registers 111(1) to 111(100), respectively.

[0145] When the first input data (Di1) arrives at the upper bits of SIN, the ON signal has already advanced to the delay element 112(16), so the switch 113(16) is on and the data is stored in the register 111(16). The next input data (Di2) is stored in the register 111(17), because the ON signal has reached the delay element 112(17) and the switch 113(17) is on. In the same way, the input data Di3 to Di85 are stored in the registers 111(18) to 111(100). That is, the 15 dummy data that precede Di1, followed by Di1 to Di85, are stored in the upper bits of the registers 111(1) to 111(100), respectively.

[0146] In this way, the data input during time T51 in FIG. 9 is stored in the upper, middle, and lower bits of the registers 111(1) to 111(100), as shown in FIG. 9(e). That is, following the assignments above, the lower, middle, and upper bits of each register 111(j) (j = 1 to 100) hold Dij+15, Dij, and Dij-15, respectively.
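The register contents that result from [0143] to [0145] can be checked with a short model. This is a sketch under stated assumptions rather than the circuit itself: the function name `fill_sp31` and the field names are hypothetical, the switch 113(j) is assumed to close on input cycle 15 + j (so that register 111(1) captures Di16 on the lower bits and Di1 on the middle bits), and `None` stands for dummy data.

```python
def fill_sp31(line, delay=15):
    """Return the lower/middle/upper field captured by each register 111(j)."""
    m = len(line)

    def pixel(n):
        # 1-based pixel lookup; outside the line the value is dummy data
        return line[n - 1] if 1 <= n <= m else None

    registers = []
    for j in range(1, m + 1):
        cycle = delay + j  # assumed cycle at which switch 113(j) is on
        registers.append({
            "lower": pixel(cycle),               # undelayed input
            "middle": pixel(cycle - delay),      # delayed by circuit 32
            "upper": pixel(cycle - 2 * delay),   # delayed by circuits 32 and 33
        })
    return registers


if __name__ == "__main__":
    regs = fill_sp31(list(range(1, 101)))  # pixel n has value n
    print(regs[49])                        # register 111(50)
```

Under these assumptions register 111(j) ends up with Dij+15 in its lower field, Dij in its middle field, and Dij-15 in its upper field, with dummy values at the two ends of the line.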

[0147] Immediately after time T51, the data shown in FIG. 9(e) is held in the registers 111(1) to 111(100), so it is stored in the memories 121 of PE 102(1) to PE 102(100) via POUT1 to POUT100 of SP 31.

[0148] After these data have been stored, the desired calculation is performed (described later). That is, immediately after time T51, each PE 102(j) (j = 1 to 100) reads the received data Dij-15, Dij, and Dij+15 from the memory 121, performs the calculation in the arithmetic circuit 122, and stores the calculation result (DOUTj) back in the memory 121. This control is performed by the control circuit 103.

[0149] After the calculation, as in the conventional device, DOUT1 to DOUT100 are supplied to the registers 131(1) to 131(100) in the PS 104 via the parallel data input terminals PIN1 to PIN100 of the PS 104. The calculation results for one horizontal period (1H), DOUT1 to DOUT100, are then serialized and output word (pixel) serially from SOUT.

[0150] The "desired calculation" mentioned above is now described in detail.

[0151] First, after receiving the data from SP 31 via POUTj, each PE 102(j) passes the data Dij-15 received from SP 31 to its left neighbor. Each PE 102(j-1) stores this data Dij-15 in its memory 121.

[0152] Next, each PE 102(j) passes the data Dij-15 received from SP 31 to its right neighbor. Each PE 102(j+1) stores this data Dij-15 in its memory 121.

[0153] As a result of this series of operations, the memory 121 of each PE 102(j) holds the data Dij-15 received from SP 31 and the data Dij-16 and Dij-14 received from the adjacent processor elements.

[0154] Next, each PE 102(j) passes the data Dij received from SP 31 to its left neighbor. Each PE 102(j-1) stores this data Dij in its memory 121. Each PE 102(j-1) then passes the data Dij it has just stored in the memory 121 one step further to the left. Each PE 102(j-2) stores this data Dij in its memory 121.

[0155] Next, each PE 102(j) passes the data Dij received from SP 31 to its right neighbor. Each PE 102(j+1) stores this data Dij in its memory 121. Each PE 102(j+1) then passes the data Dij it has just stored in the memory 121 one step further to the right. Each PE 102(j+2) stores this data Dij in its memory 121.

[0156] As a result of this series of operations, the memory 121 of each PE 102(j) holds the data Dij received from SP 31 and the data Dij-1, Dij-2, Dij+1, and Dij+2 received from the neighboring processor elements.

[0157] Next, each PE 102(j) passes the data Dij+15 received from SP 31 to its left neighbor. Each PE 102(j-1) stores this data Dij+15 in its memory 121.

[0158] Next, each PE 102(j) passes the data Dij+15 received from SP 31 to its right neighbor. Each PE 102(j+1) stores this data Dij+15 in its memory 121.

[0159] As a result of this series of operations, the memory 121 of each PE 102(j) holds the data Dij+15 received from SP 31 and the data Dij+14 and Dij+16 received from the adjacent processor elements.

[0160] Through the above operations, the memory 121 of each PE 102(j) holds Dij-16, Dij-15, Dij-14, Dij-2, Dij-1, Dij, Dij+1, Dij+2, Dij+14, Dij+15, and Dij+16.
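The eight neighbor transfers of [0151] to [0158] can be replayed in software to confirm that every processor element ends up with the eleven taps listed in [0160]. The following is a minimal sketch: the function names `gather_taps` and `pass_on` and the dictionary-based memory model are illustrative assumptions, not part of the described hardware, and `None` marks dummy data at the ends of the line.

```python
def gather_taps(line, spread=15):
    """Return mem[j][offset] = pixel D(j+offset) held by PE(j) after all transfers."""
    m = len(line)

    def pixel(n):
        # 1-based pixel lookup; outside the line the value is dummy data
        return line[n - 1] if 1 <= n <= m else None

    # Values each PE(j) receives directly from the SP: Dij-15, Dij, Dij+15
    mem = {j: {-spread: pixel(j - spread), 0: pixel(j), spread: pixel(j + spread)}
           for j in range(1, m + 1)}

    def pass_on(offset, step):
        # Every PE(j) hands the value it holds at `offset` to PE(j+step);
        # seen from the receiver, that value sits at offset - step.
        for j in range(1, m + 1):
            k = j + step
            if 1 <= k <= m:
                mem[k][offset - step] = mem[j].get(offset)

    pass_on(-spread, -1)  # [0151]: receiver gains Dij-14
    pass_on(-spread, +1)  # [0152]: receiver gains Dij-16
    pass_on(0, -1)        # [0154], first hop: Dij+1
    pass_on(+1, -1)       # [0154], second hop: Dij+2
    pass_on(0, +1)        # [0155], first hop: Dij-1
    pass_on(-1, +1)       # [0155], second hop: Dij-2
    pass_on(spread, -1)   # [0157]: Dij+16
    pass_on(spread, +1)   # [0158]: Dij+14
    return mem


if __name__ == "__main__":
    mem = gather_taps(list(range(1, 101)))  # pixel n has value n
    print(sorted(mem[50]))
```

For a processor element away from the line ends, the stored offsets are exactly -16, -15, -14, -2, -1, 0, +1, +2, +14, +15, and +16, obtained with eight transfer steps in total.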

[0161] Each PE 102(j) reads Dij-16, Dij-15, Dij-14, Dij-2, Dij-1, Dij, Dij+1, Dij+2, Dij+14, Dij+15, and Dij+16 from the memory 121 in sequence and supplies them to the arithmetic circuit 122 in the PE 102(j). The arithmetic circuit 122 multiplies these data, in that order, by COEa0, COEb0, COEc0, COEa1, COEb1, COEc1, COEd1, COEe1, COEa2, COEb2, and COEc2, and accumulates the products. The accumulated result finally obtained is stored again in the memory 121.
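The multiply-accumulate of [0161] amounts to an 11-tap filter whose nonzero coefficients sit in three groups. The sketch below pairs offsets and coefficient names in the order the text lists them; the function name `dout`, the dictionary of coefficients, and the decision to skip taps that fall outside the line (the source treats them as dummy data) are assumptions made for illustration.

```python
OFFSETS = (-16, -15, -14, -2, -1, 0, 1, 2, 14, 15, 16)
NAMES = ("COEa0", "COEb0", "COEc0",
         "COEa1", "COEb1", "COEc1", "COEd1", "COEe1",
         "COEa2", "COEb2", "COEc2")


def dout(line, j, coe):
    """Accumulate coefficient * pixel over the eleven taps around pixel j (1-based)."""
    acc = 0
    for offset, name in zip(OFFSETS, NAMES):
        n = j + offset
        if 1 <= n <= len(line):  # assumption: taps outside the line are skipped
            acc += coe[name] * line[n - 1]
    return acc


if __name__ == "__main__":
    line = list(range(1, 101))            # pixel n has value n
    unit = {name: 1 for name in NAMES}    # all coefficients set to 1
    print(dout(line, 50, unit))
```

With all coefficients equal to 1 the result for j = 50 is 550, since the eleven offsets sum to zero and each tap contributes 50 plus its offset.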

[0162] This concludes the description of the "desired calculation".

[0163] Conventionally, even when the calculation used data that arrive well separated in time ("Dij-16, Dij-15, Dij-14", "Dij-2, Dij-1, Dij, Dij+1, Dij+2", and "Dij+14, Dij+15, Dij+16"), the necessary data had to be brought to each processor element by passing data between adjacent processor elements. In this embodiment, by contrast, data that arrive well separated in time are delayed by that separation using delay circuits before being input to the serial/parallel converter. The necessary data can thus be brought to each processor element, or to a processor element in its vicinity, without passing it between adjacent processor elements, so the data transfer time can be shortened.

[0164] Therefore, the calculation can be completed within one horizontal period.

【０１６５】[0165]

[Effects of the Invention] As described above, according to the parallel processor of claim 1, the plurality of serial/parallel conversion means shift the timing of the serial/parallel conversion and the plurality of arithmetic means perform the arithmetic processing. Multiplication by a filter whose nonzero coefficients form groups, such as the filter used in a ghost canceller, can therefore be performed efficiently.

[0166] According to the parallel processor of claim 4, the serial/parallel conversion means in which a plurality of registers are connected in series receives the first data at the first-stage register and outputs the parallelized first data from the outputs of the respective registers, thereby shifting the timing of the serial/parallel conversion, and the arithmetic means perform the arithmetic processing. Multiplication by a filter whose nonzero coefficients form groups, such as the filter used in a ghost canceller, can therefore be performed efficiently.

[0167] According to the parallel processor of claim 5, data that arrive well separated in time are delayed by the data delay means and then serial/parallel converted, and the plurality of arithmetic means perform the arithmetic processing. Multiplication by a filter whose nonzero coefficients form groups, such as the filter used in a ghost canceller, can therefore be performed efficiently.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of a first embodiment of the parallel processor of the present invention.

FIG. 2 is a block diagram showing the configuration of the processor element of FIG. 1.

FIG. 3 is a timing chart illustrating the operation of the parallel processor of FIG. 1.

FIG. 4 is a block diagram showing the configuration of a second embodiment of the parallel processor of the present invention.

FIG. 5 is a block diagram showing the configuration of the SPR of FIG. 4.

FIG. 6 is a timing diagram illustrating the operation of the SPR of FIG. 5.

FIG. 7 is a block diagram showing the configuration of the processor element of FIG. 4.

FIG. 8 is a block diagram showing the configuration of a third embodiment of the parallel processor of the present invention.

FIG. 9 is a timing diagram illustrating the operation of the parallel processor of FIG. 8.

FIG. 10 is a block diagram showing the configuration of a conventional parallel processor.

FIG. 11 is a block diagram showing the configuration of the SP of FIG. 10.

FIG. 12 is a timing diagram illustrating the operation of the SP of FIG. 11.

FIG. 13 is a block diagram showing the configuration of the processor element of FIG. 10.

FIG. 14 is a block diagram showing the configuration of the PS of FIG. 10.

FIG. 15 is a timing diagram illustrating the operation of the PS of FIG. 14.

FIG. 16 is a timing diagram illustrating the operation of the parallel processor of FIG. 10.

FIG. 17 is an explanatory diagram illustrating the data processed by the parallel processor of FIG. 10.

FIG. 18 is an explanatory diagram illustrating a first example of the filter processing performed by the parallel processor of FIG. 10.

FIG. 19 is an explanatory diagram illustrating a second example of the filter processing performed by the parallel processor of FIG. 10.

[Explanation of symbols]

1, 2, 3: SP
4(1) to 4(100): PE
11: SPR
12(1) to 12(100): PE
21(1) to 21(100): register
31: SP
32, 33: delay circuit
101: SP
102(1) to 102(100): PE
103: control circuit
104: PS
111(1) to 111(100): register
112(1) to 112(100): delay element
113(1) to 113(100): switch
121: memory
122: arithmetic circuit
131(1) to 131(100): register
132(1) to 132(100): delay element
133(1) to 133(100): switch

Claims

1. A parallel processor comprising: a plurality of serial/parallel conversion means for converting a plurality of serially input first data into parallel form; a plurality of arithmetic means for receiving in parallel the parallelized first data from the plurality of serial/parallel conversion means and performing arithmetic processing; and parallel/serial conversion means for receiving in parallel the second data generated by the arithmetic means, converting the second data into serial form, and outputting it serially.

2. The parallel processor according to claim 1, wherein the plurality of serial/parallel conversion means receive the first data at the same timing and perform the parallel conversion at timings shifted among the plurality of serial/parallel conversion means.

3. The parallel processor according to claim 1, wherein the first data is a video signal and the operation performed by the arithmetic means is filter processing for a ghost canceller.

4. A parallel processor comprising: serial/parallel conversion means for converting a plurality of serially input first data into parallel form; a plurality of arithmetic means for receiving in parallel the parallelized first data from the serial/parallel conversion means and performing arithmetic processing; and parallel/serial conversion means for receiving in parallel the second data generated by the arithmetic means, converting the second data into serial form, and outputting it serially, wherein a plurality of the serial/parallel conversion means are provided, and at least one of the plurality of serial/parallel conversion means has registers connected in series, receives the first data at the first stage of the registers, and outputs the parallelized first data from the outputs of the respective registers.

5. A parallel processor comprising: data delay means for delaying a plurality of serially input first data; serial/parallel conversion means for converting the delayed first data from the delay means into parallel form; a plurality of arithmetic means for receiving in parallel the first data parallelized by the serial/parallel conversion means and performing arithmetic processing; and parallel/serial conversion means for receiving in parallel the second data generated by the arithmetic means, converting the second data into serial form, and outputting it serially.

6. The parallel processor according to claim 5, wherein the delay means delays the first data by the time taken for first data that is not required for the arithmetic processing in the arithmetic means to be input to the delay means.