JPH0583943B2

Info

Publication number: JPH0583943B2
Application number: JP2059859A
Authority: JP
Inventors: Tokuyasu Imon; Mamoru Sugie; Toshiaki Tarui; Hiromitsu Maeda
Original assignee: Agency of Industrial Science and Technology
Current assignee: National Institute of Advanced Industrial Science and Technology AIST
Priority date: 1990-03-13
Filing date: 1990-03-13
Publication date: 1993-11-30
Also published as: JPH03262071A

Description

[Detailed description of the invention] [Industrial application field]

The present invention relates to a parallel computer comprising a plurality of independently operable processing elements and a network that enables communication among them.

[Conventional technology]

In a conventional parallel computer, as described in Japanese Patent Application Laid-Open No. 62-274451 (hereinafter referred to as the first prior art), data output from the sending processor is sent to the interconnection network in the order in which it was output, and data arriving from the interconnection network is taken into the receiving processor in the order in which it was received. That is, data is transferred from the source processor to the interconnection network by writing it from the processor into a transmit data register, transferring it from the transmit data register to a transmit register, and then transferring it from the transmit register to the interconnection network. Likewise, data is transferred from the interconnection network to the destination processor by transferring it from the network to a receive data register, then from the receive data register to an accepted data register, and finally from the accepted data register to the processor.

Although it is not a parallel computer, a system in which many processors are arranged in a distributed manner is known to apply priority control when sending data to a communication line, as described in Japanese Patent Application Laid-Open No. 58-40952 (hereinafter referred to as the second prior art). That is, when data is transmitted from the source processor to the line, high-priority data and low-priority (non-priority) data are stored in a first memory and a second memory, respectively. Priority data is output consecutively to a third memory up to a predetermined number of items, whereas only one item of non-priority data is output to the third memory when a predetermined condition is satisfied. When the line becomes available, the data in the third memory is output to the line.
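The priority scheme of the second prior art can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the function name `fill_staging`, the `burst_limit` parameter, and in particular the rule used for the "predetermined condition" (one non-priority item is released after each priority burst) are not given in the source.

```python
from collections import deque


def fill_staging(priority, non_priority, staging, burst_limit):
    """Move one scheduling round of data into the third (staging) memory.

    priority     -- first memory, holding high-priority data (deque)
    non_priority -- second memory, holding non-priority data (deque)
    staging      -- third memory, drained to the line when it is free (deque)
    burst_limit  -- the 'predetermined number' of consecutive priority items

    ASSUMPTION: the source does not state the 'predetermined condition' for
    non-priority data; here it is taken to be 'the priority burst has ended'.
    """
    moved = 0
    # Priority data goes out consecutively, up to the predetermined number.
    while priority and moved < burst_limit:
        staging.append(priority.popleft())
        moved += 1
    # Then exactly one non-priority item is allowed through.
    if non_priority:
        staging.append(non_priority.popleft())
    return staging
```

Note that this ordering exists only on the sending side. Nothing in the scheme lets the receiver reorder what arrives from the line.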

[Problem to be solved by the invention]

The first prior art does not distinguish data that requires urgent transfer from data that does not. Moreover, the processor cannot output the next data item until the transfer of the current item has completed. Consequently, data for which urgent transfer is desirable cannot be transferred ahead of other data.

The second prior art distinguishes priority data from non-priority data on the sending side, but the receiving processor can process data only in the order in which it arrives from the line.

Parallel computers often use networks such as a crossbar. In that case, the data sent to a single receiving processor comes from more than one source. Therefore, even if priority data is output to the network early at its source, the receiving processor cannot receive it with priority.

In a parallel computer, distributing load among processors requires learning the load of other processors promptly. With the prior art described above, both the inquiry about another processor's load and the reply to it may be delayed. If the load of the queried processor changes during this delay, the inquiring processor cannot learn the correct load value, and desirable dynamic load distribution cannot be achieved.

An object of the present invention is to provide a parallel computer that can transfer urgent data more quickly.

[Means to solve the problem]

To achieve the above object, the present invention determines, from a message identifier, whether a message output from the processor in a processing element requires urgent transfer. Means are provided that, when the message does require urgent transfer, select it as the next message to be sent to the network, ahead of the messages waiting for transmission in the means that stores messages to be transferred to other processing elements.

In addition, the invention determines, from the message identifier, whether a message arriving from the network requires urgent transfer. Means are provided that, when it does, select it as the next message to be processed by the processor, ahead of the messages waiting for processing in the means that stores messages transferred from other processing elements.
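The selection rule described above is the same on the sending and the receiving side: an urgent message bypasses the FIFO of waiting messages. The following Python sketch models that rule in software. The class and method names (`Message`, `BypassQueue`, `put`, `select_next`) are hypothetical, and the single bypass slot is an assumption that stands in for the register holding the newly arrived message.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    dest: int      # number of the destination processing element
    urgent: bool   # message identifier: True means urgent transfer
    data: object


class BypassQueue:
    """FIFO of waiting messages plus one bypass slot for an urgent message."""

    def __init__(self):
        self.waiting = deque()  # messages waiting for transmission/processing
        self.bypass = None      # newly arrived urgent message, if any

    def put(self, msg):
        if msg.urgent:
            # ASSUMPTION: only one urgent message can be pending at a time.
            if self.bypass is not None:
                raise RuntimeError("bypass slot is still occupied")
            self.bypass = msg
        else:
            self.waiting.append(msg)

    def select_next(self):
        """Return the next message: the urgent one first, else FIFO order."""
        if self.bypass is not None:
            msg, self.bypass = self.bypass, None
            return msg
        return self.waiting.popleft() if self.waiting else None
```

One instance would sit between the processor and the network (send side) and another between the network and the processor (receive side). Ordinary messages keep their FIFO order in both.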

[Effect]

A message output from the processor in a processing element is examined by the means that determines whether it requires urgent transfer. If it does, it is sent to the network ahead of the messages already stored and waiting for transmission in the means that stores messages to be transferred to other processing elements, so its waiting time before being sent to the network is short.

Likewise, a message newly received from the network is examined by the means that determines whether it requires urgent transfer. If it does, it is input to the processor ahead of the messages already stored and waiting for processing in the means that stores messages transferred from other processing elements. Urgent messages are therefore transferred quickly.

[Example]

An embodiment of the present invention is described below with reference to FIGS. 1 and 2.

FIG. 1 shows an example of a parallel computer embodying the present invention. In the figure, PE₁ to PE_n are processing elements, NET is a network, and CPU₁ to CPU_n are the processors in processing elements PE₁ to PE_n. OD₁ to OD_n are output registers, OB₁ to OB_n are output buffers, SO₁ to SO_n are output selectors, OR₁ to OR_n are transmit registers, IR₁ to IR_n are receive registers, IB₁ to IB_n are input buffers, SI₁ to SI_n are input selectors, and ID₁ to ID_n are input registers. ODEC₁ to ODEC_n are output identification decoders, and IDEC₁ to IDEC_n are input identification decoders. LOR₁ to LOR_n are OR circuits.

The function of each component in FIG. 1 is described next. NET is a network that transfers a message to the processing element specified in the message, for example a crossbar or omega network. C5-1 to C5-n are signals output from the network NET that instruct a processing element attempting to send a message to NET to withhold transmission. Each of these signals is '1' while the transfer of a message previously received by the network has not completed because NET is blocked, and '0' otherwise.
Since processing elements PE₁ to PE_n have the same components, processing element PE₁ is described below as an example. Processor CPU₁ consists of, for example, a Hitachi H32 microprocessor with memory and other peripheral devices connected to it. OD₁ is an output register that receives, over line l1-1, a message that CPU₁ wants to transfer to another processing element. FIG. 2 shows the format of this message. A message consists of a PE number field f_p holding the number of the destination processing element, an identifier field f_i indicating whether the message requires urgent transfer, and a data field f_d. This format is common to output register OD₁, output buffer OB₁, transmit register OR₁, receive register IR₁, input buffer IB₁, and input register ID₁.

When CPU₁ is about to write a message to output register OD₁ and inhibit signal C5-1 is '1', CPU₁ withholds the write until C5-1 becomes '0', even if the message requires urgent transfer. The value of the identifier field f_i is set by the program on CPU₁ when the message is created.

Output identification decoder ODEC₁ decodes the identifier field f_i of the message in output register OD₁. For a message that requires urgent transfer it outputs '1' on line C1-1 and '0' on line C2-1; otherwise it outputs '0' on line C1-1 and '1' on line C2-1. LOR₁ is an OR circuit that outputs '1' on line C12-1 when either or both of lines C1-1 and C5-1 are '1', and '0' otherwise.

Output buffer OB₁ is a FIFO. When output C2-1 of ODEC₁ is '1', it reads the message in output register OD₁ via line l2-1. When output C12-1 of OR circuit LOR₁ is '0', it outputs messages on line l3-1 in the order in which they were read; otherwise it outputs nothing. Output selector SO₁ selects output l3-1 of output buffer OB₁ when output C1-1 of ODEC₁ is '0', selects output l2-1 of output register OD₁ when C1-1 is '1', and drives the selected message onto line l4-1. Transmit register OR₁ takes output l4-1 of output selector SO₁ as its input and outputs to the network NET over line l5-1.
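The interplay of ODEC₁, LOR₁, OB₁ and SO₁ on the send path can be followed more easily in a behavioural model. The Python sketch below is a simplification and not the circuit itself: the class name `SendPath`, the tuple representation `(f_p, f_i, f_d)` and the single-step timing are assumptions. It also treats an empty output register as producing C1 = C2 = 0, which the source does not spell out.

```python
from collections import deque


class SendPath:
    """Behavioural model of OD, ODEC, LOR, OB, SO and OR in one PE."""

    def __init__(self):
        self.ob = deque()  # output buffer OB (FIFO of non-urgent messages)

    def step(self, od, c5):
        """Run one transfer step.

        od -- message written to output register OD as (f_p, f_i, f_d),
              or None if the CPU wrote nothing in this step
        c5 -- inhibit signal from the network (1 = withhold transmission)
        Returns the message latched into transmit register OR, or None.
        """
        if c5 == 1 and od is not None:
            # The CPU withholds writes to OD while the inhibit signal is '1'.
            raise ValueError("OD must not be written while C5 is '1'")

        urgent = od is not None and od[1] == 1
        c1 = 1 if urgent else 0                          # ODEC output C1
        c2 = 1 if (od is not None and not urgent) else 0  # ODEC output C2
        c12 = c1 | c5                                    # OR circuit LOR

        if c2 == 1:
            self.ob.append(od)   # OB reads the non-urgent message from OD

        # OB drives l3 only when neither an urgent message nor an inhibit
        # signal is present.
        l3 = self.ob.popleft() if (c12 == 0 and self.ob) else None

        # Output selector SO: OD when C1 is '1', otherwise the OB output.
        return od if c1 == 1 else l3
```

In this model an urgent message written to OD reaches the transmit register in the same step, while the messages queued in OB stay where they are until the next step in which C12 is '0'.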
Receive register IR₁ receives a message from output l6-1 of the network NET. Input identification decoder IDEC₁ decodes the identifier field f_i of the message in receive register IR₁. For a message that requires urgent transfer it outputs '1' on line C3-1 and '0' on line C4-1; otherwise it outputs '0' on line C3-1 and '1' on line C4-1. Input buffer IB₁ is a FIFO. When output C4-1 of IDEC₁ is '1', it reads the message in receive register IR₁ from line l7-1. When output C4-1 of IDEC₁ is '0' and the read instruction signal p₁ from CPU₁ is '1', it outputs messages on line l8-1 in the order in which they were read; otherwise it outputs nothing. Input selector SI₁ selects output l8-1 of input buffer IB₁ when output C3-1 of IDEC₁ is '0', selects output l7-1 of receive register IR₁ when C3-1 is '1', and drives the selected message onto line l9-1. Input register ID₁ takes output l9-1 of input selector SI₁ as its input and outputs to CPU₁ over line l10-1. When line C3-1 is '1', CPU₁ treats it as an interrupt signal and immediately performs the predetermined processing. In addition, at a break in its processing, for example when a task finishes, CPU₁ outputs '1' on read instruction signal p₁ to read a message that has accumulated in input buffer IB₁ and is waiting to be processed, and then processes it.

This completes the description of the function of each component of the parallel computer shown in FIG. 1.

The operation of this parallel computer is described next. Peripheral devices (not shown) are connected to the CPU of each processing element, and a plurality of tasks are input to the CPU from a terminal device (not shown) among them. Each processing element (for example PE₁) processes the tasks input to it. When the load on its CPU (for example CPU₁) becomes too large, for example when the task execution waiting time exceeds a certain value, it creates an inquiry message consisting of the number of another processing element (for example PE_n), an identifier indicating urgent transfer, and data indicating the number of the inquiring processing element PE₁. It sends this message to processing element PE_n via the network NET to inquire about that element's task execution waiting time. In processing element PE_n, which receives the message, an interrupt is raised to CPU_n. CPU_n suspends its normal processing and immediately creates and sends a reply message consisting of data indicating the task execution waiting time on processor CPU_n, the identifier f_i indicating urgent transfer, and the number of the inquiring processing element PE₁. The number of the inquiring processing element is taken as is from the data in the received inquiry message.

CPU₁ of the inquiring processing element PE₁ then compares the task execution waiting time on its own CPU₁ with the returned task execution waiting time on CPU_n of processing element PE_n. If the waiting time on its own CPU₁ is larger, it transfers a task being processed on its own CPU₁, together with the program and data, to processing element PE_n using messages that do not carry the urgent-transfer identifier, and has PE_n process the task. When the task has been processed, the result is transferred to processor CPU₁ in the original processing element PE₁ as a message that does not request urgent handling. Even when messages such as task transfers to other processing elements or transfers of task results remain unprocessed in output buffer OB₁ or input buffer IB₁ of the inquiring processing element PE₁, or in output buffer OB_n or input buffer IB_n of the queried processing element PE_n, the inquiry and reply messages for the task execution waiting time are transferred as urgent messages, ahead of the messages remaining in those buffers.

As described above, because this embodiment uses a single network NET, urgent transfer of messages between processing elements is achieved without changing the number of signal lines connecting each processing element PE_i to the network NET.

In this embodiment, when a newly received message at the receiving processing element requires urgent transfer, the input identification decoder (IDEC_n) raises an interrupt to the processor (CPU_n) and the message is processed immediately. Alternatively, the newly received message may be input to the processor (CPU_n) as the next message to be processed after the task currently executing on that processor finishes. The use of an interrupt, as in this embodiment, nevertheless has the significant advantage that a message requiring urgent transfer is processed sooner.
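The load-balancing exchange described in this embodiment (urgent inquiry, urgent reply, non-urgent task transfer) can be summarised in a short Python sketch. All function names, the dictionary layout of the data field, and the numeric waiting times are hypothetical. Only the three fields f_p, f_i, f_d and the urgent/non-urgent distinction come from the description.

```python
from dataclasses import dataclass

URGENT, NORMAL = 1, 0  # assumed encoding of the identifier field f_i


@dataclass(frozen=True)
class Message:
    f_p: int    # number of the destination processing element
    f_i: int    # identifier: URGENT or NORMAL
    f_d: dict   # data field


def make_inquiry(src_pe, dst_pe):
    """Overloaded PE asks another PE for its task execution waiting time."""
    return Message(dst_pe, URGENT, {"kind": "inquiry", "from": src_pe})


def make_reply(inquiry, local_wait):
    """Queried PE answers; the destination is copied from the inquiry data."""
    return Message(inquiry.f_d["from"], URGENT,
                   {"kind": "reply", "wait": local_wait})


def should_migrate(own_wait, reply):
    """Migrate a task only if the local waiting time is the larger one."""
    return own_wait > reply.f_d["wait"]


def make_task_transfer(dst_pe, task):
    """Task, program and data travel as an ordinary (non-urgent) message."""
    return Message(dst_pe, NORMAL, {"kind": "task", "task": task})
```

Only the small inquiry and reply messages are marked urgent. The bulky task transfer and its result use the ordinary path, so they cannot delay a later load inquiry.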

[Effect of the invention]

According to the present invention, messages that must be exchanged urgently between processing elements can be transferred more quickly.

[Brief explanation of the drawing]

FIG. 1 is a block diagram of a parallel computer according to an embodiment of the present invention, and FIG. 2 is a diagram of the format of messages used in the parallel computer of FIG. 1.

NET: network; PE₁, PE_n: processing elements; CPU₁, CPU_n: processors; OD₁, OD_n: output registers; ODEC₁, ODEC_n: output identification decoders; OB₁, OB_n: output buffers; SO₁, SO_n: output selectors; OR₁, OR_n: transmit registers; IR₁, IR_n: receive registers; IDEC₁, IDEC_n: input identification decoders; IB₁, IB_n: input buffers; SI₁, SI_n: input selectors; ID₁, ID_n: input registers; LOR₁, LOR_n: OR circuits.
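As an illustration of the message format of FIG. 2, the three fields named in the description (PE number f_p, identifier f_i, data f_d) can be packed into a single word. The field widths below and the function names `pack`, `unpack` and `is_urgent` are assumptions made for this Python sketch. The source gives only the fields and their order.

```python
PE_BITS = 8      # ASSUMED width of the PE number field f_p
ID_BITS = 1      # ASSUMED width of the identifier field f_i
DATA_BITS = 32   # ASSUMED width of the data field f_d


def pack(f_p, f_i, f_d):
    """Pack the fields into one integer, laid out as f_p | f_i | f_d."""
    if not (0 <= f_p < (1 << PE_BITS)
            and 0 <= f_i < (1 << ID_BITS)
            and 0 <= f_d < (1 << DATA_BITS)):
        raise ValueError("field value out of range")
    return (f_p << (ID_BITS + DATA_BITS)) | (f_i << DATA_BITS) | f_d


def unpack(word):
    """Split a packed word back into (f_p, f_i, f_d)."""
    f_d = word & ((1 << DATA_BITS) - 1)
    f_i = (word >> DATA_BITS) & ((1 << ID_BITS) - 1)
    f_p = (word >> (ID_BITS + DATA_BITS)) & ((1 << PE_BITS) - 1)
    return f_p, f_i, f_d


def is_urgent(word):
    """What an identification decoder (ODEC or IDEC) would test."""
    return unpack(word)[1] == 1
```

Because every register and buffer on the path shares this format, the same one-field test serves both the output and the input identification decoder.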

Claims

[Scope of Claims]

1. A parallel computer comprising a plurality of processing elements and a network that transfers messages among them, wherein each processing element has: a processor; transmit message storage means for storing messages output from the processor in a state of waiting for transmission to the network; first transfer control means for sequentially sending the stored messages waiting for transmission to the network in response to a message transmission inhibit signal not being input from the network; receive message storage means for storing messages received from the network in a state of waiting for processing by the processor; and second transfer control means for sequentially sending the stored received messages to the processor in response to a received-message read request from the processor; the first transfer control means has first selection means for detecting, from a message identifier contained in a message newly output from the processor, whether that message is one to be transferred urgently and, when it is, selecting the new message as the next message to be sent to the network, ahead of the messages already stored in the transmit message storage means; and the second transfer control means has second selection means for detecting whether a message newly received from the network is one to be transferred urgently and, when it is, selecting the newly received message as the next message to be processed by the processor, ahead of the messages waiting for processing stored in the receive message storage means.

2. The parallel computer according to claim 1, wherein the processor reads a received message from the receive message storage means each time it finishes executing a task, and the second selection means has means for interrupting the processor to request processing of a newly received message when that message is one to be transferred urgently.