JPH02196385A

JPH02196385A - Data driving type computer

Info

Publication number: JPH02196385A
Application number: JP1925689A
Authority: JP
Inventors: Nobufumi Komori; 伸史小守; Hironori Tsubota; 浩乃坪田; Kenji Shima; 憲司嶋
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1989-01-26
Filing date: 1989-01-26
Publication date: 1990-08-02

Abstract

PURPOSE:To reduce a hardware scale while omitting a redundant input address latch and a decoder, etc. by integrating an instruction fetching part and a desti nation designating part in one program memory, and placing this program memory in parallel to an instruction executing part. CONSTITUTION:A program memory 11 in which both an instruction code and a destination node number are stored is placed in parallel to an instruction executing part 12. Accordingly, an execution processing of the present instruction, and a readout processing of the next instruction and the destination node number are executed in parallel, and by total two cycles of this cycle and a data waiting cycle in a coincidence detecting part, the execution of one instruction is completed. In such a way, the data processing time is shortened, and also, a data driving type computer in which an unnecessary hardware is curtailed can be formed.

Description

【発明の詳細な説明】〔産業上の利用分野〕本発明は、処理対象のデータ相互間の依存関係に基づい
て処理を駆動する方式を採用している、所謂データ駆動
形計算機に関する。DETAILED DESCRIPTION OF THE INVENTION [Field of Industrial Application] The present invention relates to a so-called data-driven computer that employs a method of driving processing based on mutual dependencies between data to be processed.

［従来の技術〕従来のデータ駆動形計算機の代表的な例が、科学技術用
高速計算システム研究開発成果発表会講演予稿集（昭和
５９年６月２５Ｂ）の１９頁から２２頁及びＩＥ肺ＣＯ
ＭＰＣＯＮ　’８４５ＰＲＩＮＧ予稿集（英文）の４８
６頁から４９０頁に開示されている。これらの公知資料
は、いずれもＳｉｇｍａ−１と呼ばれるデータ駆動形計
算機の構成と動作について述べている。以下、これらの
公知資料に基づいて、従来技術を説明する。[Prior art] Typical examples of conventional data-driven computers are pages 19 to 22 of the Proceedings of the Conference on Research and Development Results of High-Speed Computing Systems for Science and Technology (June 25, 1980) and IE Pulmonary CO.
MPCON '845PRING Proceedings (English) 48
It is disclosed on pages 6 to 490. All of these publicly known materials describe the configuration and operation of a data-driven computer called Sigma-1. The prior art will be described below based on these publicly known materials.

第５図は上述の従来のデータ駆動形計算機の構成を示し
ている。FIG. 5 shows the configuration of the conventional data-driven computer mentioned above.

このデータ駆動形計算機は、計算器外部とのインタフェ
イスであるネソトワークインタフェース１４、処理対象
の各データから行先ノード番号が一致する二つのデータ
を検出するマツチングメモリ１０、ノード番号に基づい
て演算命令のコードを取出す命令取出部１５、命令コー
ドに従って演算処理を実行する命令実行部１２、演算処
理後のデータの次の行先ノード番号を指定するための行
先指定部１６等とこれらを接続するデータ経路から構成
されている。This data-driven computer includes a network interface 14 that is an interface with the outside of the computer, a matching memory 10 that detects two pieces of data with matching destination node numbers from each data to be processed, and a calculation based on the node number. An instruction fetching section 15 that extracts the instruction code, an instruction execution section 12 that executes arithmetic processing according to the instruction code, a destination specifying section 16 that specifies the next destination node number of data after the arithmetic processing, and data that connect these. It consists of routes.

また、第４図は従来のデータ駆動形計算機の具体的な動
作を説明するためのプログラム（データフローグラフ）
の例である。In addition, Figure 4 is a program (data flow graph) for explaining the specific operation of a conventional data-driven computer.
This is an example.

このデータフローグラフは、データＡとＢとをノード＃
０において加算し、この結果のデータＧと別のデータＣ
とをノード＃２において乗算してその結果データ■を求
め、他方ではデータＤとＥとをノート＃１において加算
し、この結果のデータＨと別のデータＦとをノード＃４
において乗算してその結果データＪを求め・・・、デー
タＷとＸとをノード＃　７０９６において乗算し、この
結果データＹと先に求められているデータＩとをノード
＃７０９７において加算してその結果データＺを求め・
・・というように構成されている。This data flow graph connects data A and B to node #
0, and add this resultant data G and another data C
are multiplied at node #2 to obtain the resulting data ■, and on the other hand, data D and E are added at note #1, and the resulting data H and another data F are added to node #4.
Multiply at node #7096 to obtain the resultant data J... Then, multiply data W and Find the result data Z.
It is structured as follows.

外部からネットワークインタフェース１４を介して入力
されたパケット（タグ情報を有するデータ）Δ（２０１
）は、行先ノード番号〈０〉とデータ［Ａ］とから構成
されている。このパケット２０１は、マツチングメモリ
１０に送られるが、これと同時に、パケット２０１の内
の行先ノード番号＜　Ｏ＞　（２０３）は命令取出部１
５にも送られる。データ［Ａ］の２項演算の相手となる
データ［Ｂ］を有するパケットＢが既に入力されてマツ
チングメモリ１０内で待機していると仮定すると、これ
ら２つのパケットＡ、Ｂの行先ノードは共に＃０で一致
するため、マツチングメモリ１０はデータ［Ａ］　とデ
ータ［Ｂ］とを対にしたパケット２０６を出力する。A packet (data having tag information) input from the outside via the network interface 14 Δ(201
) is composed of destination node number <0> and data [A]. This packet 201 is sent to the matching memory 10, but at the same time, the destination node number <O> (203) in the packet 201 is sent to the matching memory 10.
It will also be sent to 5. Assuming that packet B containing data [B], which is the target of a binary operation on data [A], has already been input and is waiting in the matching memory 10, the destination nodes of these two packets A and B are Since they both match #0, the matching memory 10 outputs a packet 206 that is a pair of data [A] and data [B].

一方、命令取出部１５では、ノード番号〈０〉に相当す
るアドレスの内容である命令コード「＋」（２０５）が
読出されて出力される。On the other hand, the instruction fetching unit 15 reads and outputs the instruction code "+" (205), which is the content of the address corresponding to node number <0>.

命令取出部１５のメモリ構成を第６図（ａｌに示す。The memory configuration of the instruction fetching unit 15 is shown in FIG. 6 (al).

次に、命令取出部１５に入力された行先ノード番号＜Ｑ
＞は、そのままノード番号＜　Ｏ＞　（２０）として行
先指定部１６に送られる。また、命令取出部１５で読み
出された命令コードｒ　＋　Ｊ　（２０５）は、データ
対［Ａ］、　　［Ｂ］のパケット２０６と共にパケット
２０７として命令実行部１２に与えられる。このとき、
行先指定部１６においては、第４図のデータフローグラ
フ中のノード番号〈０〉に対応する演算である加算の結
果データが次に行くべきノード１２を表す＜　２　＞　
（２０８）が読出される。Next, the destination node number input to the instruction fetching unit 15 <Q
> is sent as is to the destination specifying unit 16 as the node number <O> (20). Further, the instruction code r + J (205) read out by the instruction extraction unit 15 is given to the instruction execution unit 12 as a packet 207 together with the packet 206 of the data pair [A] and [B]. At this time,
In the destination specifying unit 16, the result data of the addition which is the operation corresponding to the node number <0> in the data flow graph of FIG. 4 represents the next node 12 to go to <2>.
(208) is read out.

行先指定部１６のメモリ構成を、第６図中）に示す。The memory configuration of the destination specifying unit 16 is shown in FIG.

また同時に、命令実行部１２においては［Ａ］＋［Ｂ］
の演算が行われ、その演算結果データ［Ｇ］（２０９）
が出力される。行先指定部１６の出力２０８と命令実行
部１２の出力２０９と、即ち行先ノード番号〈０〉とデ
ータ［Ｇ］とはパケット２１０としてネットワークイン
タフェース１４を通過し、再び命令取出部１５及びマツ
チングメモリ１０に送られる。At the same time, in the instruction execution unit 12, [A]+[B]
The calculation is performed, and the calculation result data [G] (209)
is output. The output 208 of the destination designation unit 16 and the output 209 of the instruction execution unit 12, that is, the destination node number <0> and data [G], pass through the network interface 14 as a packet 210, and are sent to the instruction extraction unit 15 and the matching memory again. Sent to 10.

以上の様な処理の連鎖によって、第４図に示したデータ
フローグラフの全ノードに相当する演算が施され、プロ
グラムの実行が終了する。このとき、データ依存関係が
存在するノード、例えば、ノード＃０とノード＃２とに
おける処理はこの順に逐次的にのみ実行可能であるが、
データ依存関係が存在しないノード、例えば、ノード＃
Ｏとノード＃１とにおける処理は処理資源の許す限りに
おいて並列に実行可能である。Through the chain of processing as described above, operations corresponding to all nodes of the data flow graph shown in FIG. 4 are performed, and the execution of the program is completed. At this time, the processes at the nodes where there is a data dependency relationship, for example, node #0 and node #2, can only be executed sequentially in this order.
Nodes where no data dependencies exist, e.g. node #
Processing in O and node #1 can be executed in parallel as far as processing resources allow.

なおここで、データ依存関係とは、二つのノード間の接
続関係において、一方のノードの処理が完了することに
よってはしめて他方の処理を行うために必要な入力デー
タが供給されるような接続関係にあることを指している
。Note that a data dependency relationship here refers to a connection relationship between two nodes in which the completion of the processing of one node immediately supplies the input data necessary to perform the processing of the other node. It refers to something in

命令取出部１５と行先指定部】６のメモリ構成を、前述
の如く、第６図（ａｌ及び（ｂ）にそれぞれ示す。各々
の情報のビット幅は、公知資料では明確に記述されてい
ないので、ここでは、命令コードのビット幅を４ビツト
、行先ノード番号のビット幅を１６ビツトと仮定した。The memory configurations of the instruction fetch unit 15 and the destination designation unit 6 are shown in FIGS. Here, it is assumed that the bit width of the instruction code is 4 bits and the bit width of the destination node number is 16 bits.

また、上記公知資料には、データコピー操作に関する具
体的な記述がないので、データコピーについては広く一
般的に知られている手法を用いることを仮定した。Furthermore, since the above-mentioned publicly known materials do not have any specific description regarding data copy operations, it was assumed that a widely and generally known method would be used for data copying.

行先指定部１６内のメモリにおける最上位ビット“ｃｏ
ｐｙ″はコピービットであり、第４図に示したデータフ
ローグラフ中のあるノードにおける演算結果が複数のノ
ードに送出される場合に、データコピーを行ってそれぞ
れのデータに対して行先ノード番号を与えるために用い
られている。例えば、ノード＃２における演算結果がノ
ード＃　７０９７とノード舅５との両方に送出されてい
るのに対応して、上記メモリの第２番地のコピービット
は“１″が格納されており、この場合、行先ノード番号
＃　７０９７が読出されて演算結果［１１に付加されて
出力された後、連続的に次の戦地（第３番地）が読出さ
れて行先ノード番号Ｉ５が同一の演算結果［１に付加さ
れて出力される。The most significant bit “co” in the memory in the destination designation unit 16
py'' is a copy bit, and when the calculation result at a certain node in the data flow graph shown in Figure 4 is sent to multiple nodes, data is copied and the destination node number is assigned to each data. For example, in response to the calculation result at node #2 being sent to both node #7097 and node 5, the copy bit at the second address of the memory is set to “1”. In this case, the destination node number #7097 is read out and added to the calculation result [11] and output, and then the next battlefield (3rd address) is read out continuously and the destination node number is I5 is added to the same calculation result [1 and output.

[Problem to be solved by the invention]

ところが、従来のデータ駆動形計算機には、二つの問題
点がある。However, conventional data-driven computers have two problems.

第１の問題点は、プログラムメモリの規模が大きいこと
である。現在、広く一般に用いられているノイマン形計
算機の場合、プログラムに記述された順序に従って逐次
処理を行うので、プログラムメモリクと呼ばれる一個の
レジスタで実行番地が一元的に管理されている。従って
、分岐命令以外は次に実行すべき命令の番地（飛び先番
地）を特に指定する必要がなく、データ駆動形計算機に
比して小さい規模のプログラムメモリで同一内容のプロ
グラムを格納することができる。これに対して、データ
駆動形計算機の場合には、各命令の処理を並列的に実行
するため、実行番地を一元的に管理することが不可能で
ある。このため、データ駆動形計算機では、原理的に総
ての命令に対して次に実行すべき命令の番地（行先ノー
ド番号）を指定しておく必要があり、プログラムメモリ
の規模が拡大する原因となっている。上述の従来例にお
いても、１命令当りのビット幅２１ビツト（命令コード
４ビツト＋行先ノード番号１７ビソト）の内、１６ビソ
トを行先ノード番号が占めている。The first problem is that the program memory is large. In the case of the Neumann computer which is currently widely used, processing is performed sequentially according to the order written in the program, so the execution address is centrally managed by a single register called a program memory. Therefore, except for branch instructions, there is no need to specify the address of the next instruction to be executed (jump address), and programs with the same content can be stored in a smaller program memory than data-driven computers. can. On the other hand, in the case of a data-driven computer, each instruction is processed in parallel, so it is impossible to centrally manage execution addresses. For this reason, in data-driven computers, it is necessary in principle to specify the address of the next instruction to be executed (destination node number) for every instruction, which causes the size of the program memory to expand. It has become. In the above-mentioned conventional example as well, the destination node number occupies 16 bits out of the 21-bit bit width (instruction code 4 bits + destination node number 17 bits) per instruction.

第２の問題点は、プログラムメモリが、命令取り出し部
と行先指定部とに分離されていることである。このため
、第６図ＦａＬ　（ｂｌに示されているように、あるノ
ードに着目してみると、これら二つのメモリに対する入
力アドレスが全く同一であるにも拘わらず、両者が別々
のパイプライン段に配置されているため、アクセスが同
時に行われずに不要な処理時間を要し、また入力アドレ
スラノチ及びアドレスデコーダを独立に２組備える必要
があり、ハードウェア規模を不要に大きくすることにな
る。The second problem is that the program memory is separated into an instruction fetch section and a destination specification section. For this reason, as shown in Figure 6 FaL (bl), if we focus on a certain node, even though the input addresses to these two memories are exactly the same, they are in different pipeline stages. Since accesses are not performed simultaneously, unnecessary processing time is required, and it is necessary to provide two independent sets of input address registers and address decoders, which unnecessarily increases the hardware scale.

上述の第１の問題点に関しては、本願発明者らは先に特
願昭６３−００００００号の発明にてその解決を目的と
したデータ駆動形計算機を提案している。Regarding the first problem mentioned above, the inventors of the present application have previously proposed a data-driven computer aimed at solving the problem in the invention of Japanese Patent Application No. 63-000000.

本発明は、上記第２の問題点に鑑み、データ処理時間を
短縮すると共に、不要なハードウェアを削減したデータ
駆動形計算機の提供を目的とする。In view of the second problem, the present invention aims to provide a data-driven computer that reduces data processing time and eliminates unnecessary hardware.

[Means to solve the problem]

本明細書のデータ駆動形計算機は、従来のデータ駆動形
計算機の命令取出し機能と行先指定機能とを一つのプロ
グラムメモリに統合し、このプログラムメモリを命令実
行部と並列配置することにより、冗長な入力アドレスラ
ンチあるいはデコーダを排除し、ハードウェア規模を低
減している。The data-driven computer of this specification integrates the instruction fetch function and destination specification function of conventional data-driven computers into one program memory, and arranges this program memory in parallel with the instruction execution section, thereby eliminating redundant functions. The input address launcher or decoder is eliminated, reducing the hardware scale.

また、プログラムメモリと命令実行部とを並列配置する
ことにより、一つの命令を実行するための基本サイクル
を■マツチングメモリと■（プログラムメモリ＋命令実
行部）の二つのパイプライン段にて構成している。In addition, by arranging the program memory and instruction execution section in parallel, the basic cycle for executing one instruction consists of two pipeline stages: ■ matching memory and ■ (program memory + instruction execution section). are doing.

[Effect]

本発明に係るデータ駆動形計算機においては、命令コー
ド及び行先ノード番号の両方を格納したプログラムメモ
リを命令実行部と並列配置しているので、現在の命令の
実行処理と、次の命令及び行先ノード番号の読出し処理
とが並列して行われ、このサイクルと一致検出部におけ
るデータ待合わせのサイクルとの計２サイクルで一つの
命令の実行が完了される。In the data-driven computer according to the present invention, the program memory storing both the instruction code and the destination node number is arranged in parallel with the instruction execution unit, so that the execution process of the current instruction and the next instruction and destination node The number reading process is performed in parallel, and execution of one instruction is completed in a total of two cycles: this cycle and the data waiting cycle in the coincidence detection section.

[Embodiments of the invention]

以下、本発明をその実施例を示す図面に基づいて詳述す
る。DESCRIPTION OF THE PREFERRED EMBODIMENTS The present invention will be described in detail below based on drawings showing embodiments thereof.

第１図は本発明に係るデータ駆動形計算機の一実施例の
構成を示すブロック図である。FIG. 1 is a block diagram showing the configuration of an embodiment of a data-driven computer according to the present invention.

本発明のデータ駆動形計算機は、外部とのインクフェイ
スであるネットワークインタフェイス１４、処理対象の
各データから行先ノード番号が一致する二つのデータを
検出する一致検出部としてのマツチングメモリ１０、ノ
ード番号の基づいて演算命令のコードを取出す機能と演
算処理後のデータの次の行先ノード番号を指定するため
の二つの機能を有するプログラムメモリ１１、命令コー
ドに従って演算処理を実行する演算処理部としての命令
実行部１２、行先ノード番号の更新処理のための演算手
段としての加算器１３等とこれらを接続するデータ経路
にて構成されている。The data-driven computer of the present invention includes a network interface 14 which is an ink interface with the outside, a matching memory 10 which is a matching detection unit which detects two pieces of data having matching destination node numbers from each data to be processed, and a node A program memory 11 has two functions: one to extract the code of an arithmetic instruction based on the number, and the other to specify the next destination node number of the data after the arithmetic processing. It is composed of an instruction execution unit 12, an adder 13 as an arithmetic means for updating the destination node number, and a data path connecting them.

このような本発明のデータ駆動形計算機により、第３図
に示すプログラム（データフローグラフ）を実行する場
合について動作を説明する。なお、第３図に示すデータ
フローグラフは、前述の第４図のプログラムと全く同一
のプログラムであるが、本発明の説明のためにノード番
号の割付は方を異ならせである。The operation of the data-driven computer of the present invention when executing the program (data flow graph) shown in FIG. 3 will be described. The data flow graph shown in FIG. 3 is the same program as the program shown in FIG. 4 described above, but the node numbers are assigned differently for the purpose of explaining the present invention.

このデータフローグラフは、データＡとＢとをノード＃
Ｏにおいて加算し、この結果のデータＧと別のデータＣ
とをノード＃２において乗算してその結果データＩを求
め、他方ではデータＤとＥとをノード＃１において加算
し、この結果のデータＧと別のデータＣとをノード＃５
において乗算してその結果データＪを求め・・・、デー
タＷとＸとをノード＃７０９６において乗算し、この結
果データＹと先に求められているデータ■とをノード＃
　７０９７において加算してその結果データＺを求め・
・・というように構成されている。This data flow graph connects data A and B to node #
O, add this result data G and another data C
are multiplied at node #2 to obtain the resulting data I, and on the other hand, data D and E are added at node #1, and the resulting data G and another data C are added to node #5.
Multiply at node #7096 to obtain the resultant data J... Data W and X are multiplied at node #7096, and this resultant data Y and the previously obtained data
7097 to calculate the resultant data Z.
It is structured as follows.

外部からネットワークインタフェース】４を経由して入
力されたパケット１０１は、行先ノード番号〈０〉、デ
ータ［Ａコ及び命令コード「＋ｊを含んでいる。A packet 101 input from the outside via the network interface 4 includes a destination node number <0>, data [A], and an instruction code ``+j''.

本発明のデータ駆動形計算機においては、２サイクルの
パイプライン処理が行われる。その初段はマツチングメ
モリ１０におけるデータの待合わせである。外部から入
力されたパケット１０１は、マツチングメモリ１０に送
られるが、この際、２項演算「＋」の相手のデータ［Ｂ
］を含むパケットが未到着の場合には、入力されたパケ
ット１０１はマツチングメモリ内に一旦格納され、デー
タ［Ｂ］を含むパケットの到着を待つ。また、既にもう
一方のデータ［Ｂ］を含むパケットが到着してマツチン
グメモリ内に格納されている場合には、これら二つのパ
ケットの行先ノード番号〈０〉が一致していることが検
出され、入力パケット１０１にデータ［Ｂ］を付加した
パケット１０２がマツチングメモリ１０から送出される
。In the data-driven computer of the present invention, two-cycle pipeline processing is performed. The first stage is the waiting of data in the matching memory 10. A packet 101 input from the outside is sent to the matching memory 10, but at this time, the data [B
] has not arrived, the input packet 101 is temporarily stored in the matching memory and waits for the packet containing data [B] to arrive. Additionally, if a packet containing the other data [B] has already arrived and is stored in the matching memory, it is detected that the destination node numbers <0> of these two packets match. , a packet 102 obtained by adding data [B] to the input packet 101 is sent out from the matching memory 10.

パケット１０２に含まれている情報の内、行先ノード番
号＜　Ｑ　＞　（１０３）はプログラムメモリ１１及び
後述する加算器１３に送られ、残りのデータ［Ａ］［Ｂ
］及び命令コード「＋」は、パケット１０４として命令
実行部１２に送られる。Among the information contained in the packet 102, the destination node number <Q> (103) is sent to the program memory 11 and the adder 13, which will be described later, and the remaining data [A] [B
] and the instruction code “+” are sent to the instruction execution unit 12 as a packet 104.

命令実行部１２では、命令コード「＋」に従ってデータ
［Ａ］　と［Ｂ］との加算が行われ、加算結果データ［
Ｇ］　（１０７）が出力される。In the instruction execution unit 12, data [A] and [B] are added according to the instruction code "+", and the addition result data [
G] (107) is output.

一方、命令実行部１２における処理と並列して実行され
るプログラムメモリ１１での処理について、以下に説明
する。On the other hand, the processing in the program memory 11 that is executed in parallel with the processing in the instruction execution unit 12 will be described below.

第２図は、本発明に係るデータ駆動型計算機のプログラ
ムメモリ１１のメモリ構成を示している。FIG. 2 shows the memory configuration of the program memory 11 of the data-driven computer according to the present invention.

第２図のメモリにおいて、各ワードの最上位ビ・７ト“
ＥＸＴ″は後述する拡張アドレスフラグである。In the memory shown in Figure 2, the seven most significant bits of each word "
EXT'' is an extended address flag to be described later.

その下位側のビソビＣ０ＰＹ”は、従来例におりる“ｃ
ｏｐｙ”ビットと同様の意味を有し、この“Ｃ０ＰＹ”
ビットが“１”であれば、演算結果の行先ノードが複数
個あり、これらの各行先ノードに関する命令コードと行
先ノード番号の情報とが次の番地以降に格納されている
ことを示している。The lower side Bisobi C0PY” is “c0PY” in the conventional example.
This “C0PY” bit has the same meaning as the “opy” bit.
If the bit is "1", this indicates that there are a plurality of destination nodes for the operation result, and that the instruction code and destination node number information regarding each of these destination nodes are stored at the next address or later.

この“ｃｏｐｙ″ビットに続く下位側４ピツ１〜には、
次のサイクルで実行すべき命令コード（ＯＰＥＣＯＤＥ
）が格納されており、その更に下位側の６ビツトには、
行先ノード番号が現在のノード番号に対する相対番地（
ＲＮＯＤＥ＞で格納されている。The lower 4 bits 1~ following this “copy” bit are:
Instruction code (OPECODE) to be executed in the next cycle
) is stored in the lower 6 bits,
If the destination node number is a relative address to the current node number (
RNODE>.

上述の如き本発明のデータ駆動型計算機の具体的な動作
を説明する。The specific operation of the data-driven computer of the present invention as described above will be explained.

パケット１０］　がマツチングメモリ１０において一致
検出されてパケット１０２が出力されると、パケット１
０２のノード番号＜　Ｏ＞　（１０３）がプログラムメ
モリ１１に入力される。これにより、第２図に示したメ
モリの第０番地がアクセスされて、命令コード「＋」と
、相対行先ノード番号く２〉が読出されて、命令コード
レジスタ（ＯＰＥ−ＲＥＧ）　１３１及び行先ノード番
号レジスタ（ＤＥＳＴ−ＲＥＧ）　１３２にそれぞれ書
込まれる。When packet 10] is detected as a match in matching memory 10 and packet 102 is output, packet 1
The node number <O> (103) of 02 is input to the program memory 11. As a result, address 0 of the memory shown in FIG. 2 is accessed, the instruction code "+" and the relative destination node number 2> are read out, and the instruction code register (OPE-REG) 131 and the destination node are read out. Each number is written to the number register (DEST-REG) 132.

命令コードは命令コードレジスタ１３１からそのまま出
力される。一方、相対行先ノード番号く２〉は行先ノー
ド番号レジスフ１３２から加算器１３に送られ、加算器
１３において、先に入力されている入力ノード番号＜　
Ｏ＞　（１０３）と加算されて行先ノード番号の絶対値
＜　２　＞　（１０６）が得られる。The instruction code is output as is from the instruction code register 131. On the other hand, the relative destination node number <2> is sent from the destination node number register 132 to the adder 13, and in the adder 13, the input node number <<
O> (103) is added to obtain the absolute value of the destination node number <2> (106).

このようにして得られた命令コード「＋」１行先ノード
番号〈２〉及び命令実行部１２から出力されるデータ［
Ｇ］にて構成されるパケット１０Ｂが再びネソＩ・ワー
クインタフェイス１４を経由してマツチングメモリ１０
に供給され、このような処理の反復によって、プログラ
ムの全体が実行される。The instruction code “+” obtained in this way, the one-line destination node number <2>, and the data output from the instruction execution unit 12 [
The packet 10B configured in
The entire program is executed by repeating this process.

但し本発明では、相対行先ノード番号を格納するプログ
ラムメモリ１１のメモリのビット幅を６ビソトに制限し
ているために、実際の相対行先ノード番号の値が２５５
（＝　２！　−１）を越えた場合に桁溢れが発生する。However, in the present invention, since the memory bit width of the program memory 11 that stores the relative destination node number is limited to 6 bits, the actual value of the relative destination node number is 255 bits.
(= 2! -1), overflow occurs.

例えば、第３図のデータフローグラフにおけるノード＃
２からノード＃　７０７９への矢印に対応する相対行先
ノード番号がこれに相当する。このような場合には、各
ノード番号に対応するプログラムメモリの最上位ビット
“ＥＸＴ″を′１″　とすることによって、溢れた桁を
格納するための拡張アドレスビット領域として次の番地
を用いることが可能なように本発明では構成しである。For example, node # in the data flow graph in Figure 3
This corresponds to the relative destination node number corresponding to the arrow from 2 to node #7079. In such a case, by setting the most significant bit "EXT" of the program memory corresponding to each node number to '1', the next address can be used as an extended address bit area to store the overflow digit. The present invention is configured so that this is possible.

第３図のノード１１２における処理に必要な入力データ
［Ｇ］　と１Ｃ］　とが揃い、マツチングメモリ１０か
ら行先ノード番号く２〉、命令コード「＋」データ［Ｇ
］及び［Ｃ］にて構成されるパケットが出力された場合
について説明する。The input data [G] and 1C] necessary for the processing at the node 112 in FIG.
] and [C] is output.

上述の如く、命令コードとデータは命令実行部１２に送
られて乗算が実行され、結果データ［１］が出力される
。これと並行して、行先ノード番号〈２〉がプログラム
メモリＩＩに送出され、行先ノ−ド番号が読出される。As described above, the instruction code and data are sent to the instruction execution unit 12, multiplication is executed, and result data [1] is output. In parallel with this, the destination node number <2> is sent to the program memory II, and the destination node number is read out.

入力されたノード番号が＜Ｎ＞であるときのプログラム
メモリ１１の詳細な論理動作を、第７図のフローチャー
トに示す。The detailed logical operation of the program memory 11 when the input node number is <N> is shown in the flowchart of FIG.

このフローチャート中において、ｒＲＥＡＤ　ＭＪはメ
モリのＭ番地を読出す動作、ピッｌ−（６：　０）等は
ビット６からビット０にて構成されているビット列、”
　ＥＸＴ″、“ｃｏｐｙ”はプログラムメモリ内の拡張
アドレスフラグビットとコピーフラグビットとの値をそ
れぞれ表わしている。In this flowchart, rREAD MJ is an operation to read address M of the memory, and pin-(6:0), etc. are bit strings consisting of bits 6 to 0.
EXT" and "copy" respectively represent the values of the extended address flag bit and the copy flag bit in the program memory.

行先ノード番号〈２〉を有するパケットがプログラムメ
モリ１１に入力されると、プログラムメモリ１１の第２
番地の内容が読出される（ステップＳ４）。When a packet with destination node number <2> is input to the program memory 11, the second
The contents of the address are read (step S4).

読出された内容のうち、命令コード「＋」が命令コード
レジスタ１３１に、また相対行先ノード番号〈４〉が行
先ノード番号レジスタ１３２の下位６ビソトに、即ちビ
ット（６：　Ｏ）にそれぞれ書込まれる（ステップＳ５
．Ｓ６）。Of the read contents, the instruction code "+" is written to the instruction code register 131, and the relative destination node number <4> is written to the lower 6 bits of the destination node number register 132, that is, bit (6: O). (Step S5
．． S6).

読出されたワードの“ＥＸＴ”ビットは“０″なので（
ステップＳ７）、命令コード及び行先ノード番号レジス
タの値〈４〉と入力されたノード番号〈２〉との加算器
１３による加算結果の〈６〉が行先ノード番号としで出
力される（ステップＳ８）。Since the “EXT” bit of the read word is “0” (
Step S7), the result of adding the instruction code and destination node number register value <4> with the input node number <2> by the adder 13, <6>, is output as the destination node number (Step S8) .

この際、読出されたワードの“ｃｏｐｙ″ビットが“１
″なので、読出しアドレス〈２〉をインクリメントする
と共に（ステップ５１０）、ループカウント＜Ｌ＞と行
先ノード番号レジスフ１３２の内容をクリアした後（ス
テップＳ２．　Ｓ３）、インクリメント後のアドレスで
ある第３番地を読出す（ステップＳ４）。At this time, the “copy” bit of the read word is “1”.
'', the read address <2> is incremented (step 510), and after clearing the loop count <L> and the contents of the destination node number register 132 (steps S2 and S3), the third address which is the address after the increment is is read out (step S4).

初回の読出しの際と同様に、読出したワード中の命令コ
ード「＋」と相対行先ノード番号〈５５〉（−“１１０
１１１”）　とを、命令コードレジスタ１３１と行先ノ
ード番号レジスタ１３２の下位６ビノトにそれぞれ書込
む（ステップＳ５．Ｓ６）。読出したワードの“ＥＸＴ
”ビットが“１”であるので（ステップＳ７）、読出し
アドレスをインクリメントして（ステップ５１１）、第
４番地の内容を読出し、読出したワードの下位１０ビツ
トを、行先ノード番号レジスタ１３２のビット１５から
ビット６に書込む（ステップ５１３）。As with the first read, the instruction code “+” in the read word and the relative destination node number <55> (-“110
111") are written into the lower 6 bits of the instruction code register 131 and the destination node number register 132, respectively (steps S5 and S6).
” bit is “1” (step S7), the read address is incremented (step 511), the contents of the fourth address are read, and the lower 10 bits of the read word are stored in bit 15 of the destination node number register 132. to bit 6 (step 513).

最後に読出したワードのＥＸＴビットの値が“ビである
ので（ステップ５１５）、現状の命令コードｒ＋Ｊ及び
行先ノード番号レジスタ１３２の内容＜　７０９５　＞
（−“０００１１０１１１０１１０１１１”）と入力し
たノード番号１１２との加算結果＜　７０９７　＞を出
力する（ステップＳ８）。Since the value of the EXT bit of the last read word is "BI" (step 515), the current instruction code r+J and the contents of the destination node number register 132 <7095>
(-“0001101110110111”) and the input node number 112, the addition result <7097> is output (step S8).

また、最後に読出したワードのｃｏｐｙビットの値は“
０”なので（ステップＳ９）、ノード番号〈２〉を有す
る入力パケフトに対する一連の処理を終了する。Also, the value of the copy bit of the last read word is “
0'' (step S9), the series of processing for the input packet lift having the node number <2> is completed.

上述のように、行先ノード番号を相対行先ノード番号で
与えた場合において、相対行先ノード番号が所定のビッ
ト幅を越えてオーバーフローしても、プログラムメモリ
の次の番地を相対行先ノード番号の拡張アドレス領域と
して用いることによって処理が完結され、どのような規
模の、またどのようなノード番号割付けが行われたプロ
グラムに対しても本発明が適用可能である。As mentioned above, when the destination node number is given as a relative destination node number, even if the relative destination node number overflows beyond the predetermined bit width, the next address in the program memory is set to the extended address of the relative destination node number. Processing is completed by using the area as an area, and the present invention is applicable to programs of any size and to which node numbers are assigned.

一般に、ノード間の接続は局所的であることから、プロ
グラム規模によらず、相対行先ノード番号が大きな値を
とることは稀であると考えられる。Generally, since connections between nodes are local, it is considered that the relative destination node number rarely takes a large value, regardless of the program scale.

実際に、３９８７ノードからなるテストプログラムに対
して、入力から順に近いノードに小さいノード番号を与
えるようにノード番号を割付けた場合に相対行先ノード
番号がどのような統計的分布を示すかを評価した。その
結果を第８図のグラフに示す。グラフの櫓軸は、相対行
先ノード番号を、線軸は累積度数の百分率を、各区間に
対する数字はその区間内に含まれるノード数をそれぞれ
示している。In fact, for a test program consisting of 3987 nodes, we evaluated the statistical distribution of relative destination node numbers when node numbers were assigned in such a way that nodes closest to the input were given smaller node numbers. . The results are shown in the graph of FIG. The turret axis of the graph shows the relative destination node number, the line axis shows the cumulative frequency percentage, and the number for each section shows the number of nodes included in that section.

この第８図のグラフからは、全体の約９割（８８χ）は
相対行先ノード番号が６３番地以内であることが容易に
理解される。即ち、相対行先ノード番号のビット幅を６
ビツトとしても、前述の如き拡張ピッ１−を必要とする
可能性は約１割であることを意味している。From the graph of FIG. 8, it is easily understood that about 90% (88χ) of the total have relative destination node numbers within address 63. In other words, the bit width of the relative destination node number is set to 6.
Even as a bit, this means that there is about a 10% possibility that the expansion pin 1- as described above will be necessary.

従来例のプログラムメモリのビット幅が２】ビット（命
令取り出し部４ビット＋行き先指定部１７ビノト）であ
るのに対して、本実施例の場合は１２ビツトであり、本
発明を実施した場合に拡張アドレスを格納するために余
分に必要となるワード数が」二連の統計から１０％稈度
あるとしても、メモリの規模は（１２ｘ１．１）／２１　＝　＝０．６３倍に圧縮され
る。本実施例ではプログラムメモリ空間を１６ビツトと
したが、この圧縮効果はプログラムメモリ空間が大きく
なると一層顕著になる。The bit width of the program memory in the conventional example is 2] bits (4 bits in the instruction fetch section + 17 bits in the destination specifying section), but in the case of this embodiment it is 12 bits, and when the present invention is implemented, Even if the number of extra words required to store the extended address is 10% from the double statistics, the memory size will be compressed by (12 x 1.1) / 21 = = 0.63 times. . In this embodiment, the program memory space is 16 bits, but this compression effect becomes more noticeable as the program memory space becomes larger.

例えば、プログラムメモリ空間を３２ビツトとした場合
には、相対行先ノード番号空間を８ビツトに拡大し、拡
張アドレスビットが必要となる割合を２０％と見積って
も、メモリ規模は（相対行先ノート番号→−フラグ＋命令コード）Ｘｌ、
２（行き先ノード番号十フラグ＋命令コード）となり、
実際のビット数を代入すると倍となり、半分以下に圧縮可能である。For example, if the program memory space is 32 bits, even if you expand the relative destination node number space to 8 bits and estimate that the proportion of extended address bits required is 20%, the memory size will be (relative destination node number →-flag + instruction code) Xl,
2 (destination node number 10 flag + instruction code),
Substituting the actual number of bits doubles it, making it possible to compress it to less than half.

なお、上記実施例では説明の簡便化のために、相対行先
ノード番号を正の数に限定したが、符号付き数（２の補
数表現）とすることによって、出力側のノード番号より
小さいノード番号をもつノドに対してデータを送出する
ことが可能になる。Note that in the above embodiment, the relative destination node number is limited to a positive number to simplify the explanation, but by using a signed number (two's complement representation), a node number smaller than the output side node number can be used. It becomes possible to send data to nodes with

これによって、繰り返し構造を含むプログラムに対して
も、本発明を適用できる。As a result, the present invention can be applied to programs including repetitive structures.

更に、本実施例では、相対行先ノード番号の桁溢れビッ
トを拡張ビットに割り当てたが、他の手法として、プロ
グラムをページ単位に分割し、ページ内番地を例えば６
ビ、トで表わし、ページを越えて行先ノードが存在する
場合に、相対ページ番号を拡張ビットに割当てる手法を
採ることも可能である。Furthermore, in this embodiment, the overflow bit of the relative destination node number is assigned to the extension bit, but another method is to divide the program into pages and set the address within the page to, for example, 6.
It is also possible to use a method of assigning a relative page number to an extension bit when there is a destination node beyond the page.

〔Effect of the invention〕

以上説明したように、本発明のデータ駆動形計算機にお
いては、従来例に示したデータ駆動形計算機の、命令数
山部と行先指定部とを−っのプログラムメモリに統合し
、このプログラムメモリを命令実行部と並列配置したこ
とにより、冗長な入力アドレスランチ及びデコーダ等を
省略してハードウェア規模を低減することができる。As explained above, in the data-driven computer of the present invention, the instruction number section and destination specification section of the data-driven computer shown in the conventional example are integrated into one program memory, and this program memory is By arranging it in parallel with the instruction execution section, redundant input address launches, decoders, etc. can be omitted, and the hardware scale can be reduced.

また、プログラムメモリと命令実行部を並列配置したこ
とにより、一つの命令を実行するための基本サイクルは
■マツチングメモリと■「グログラムメモリ＋命令実行
部」の二つのパイプライン段にて構成されるので、従来
例のデータ駆動形計算機に比べて遜色のない短い基本サ
イクルが得られており、単発の命令に対する応答時間も
短く抑えることができる。In addition, by arranging the program memory and instruction execution section in parallel, the basic cycle for executing one instruction consists of two pipeline stages: ■ matching memory and ■ "program memory + instruction execution section". Therefore, a short basic cycle comparable to that of conventional data-driven computers can be obtained, and the response time to a single command can also be kept short.

一般に、データ駆動形計算機は、並列分散処理に適した
計算モデルの上に構築されており、特別な制御を加えな
くてもマルチタスクが自律的に並列実行されるという際
だった特徴があると考えられている。しかしながら、マ
ルチタスクの各々を格納するプログラムメモリのアドレ
スが固定的に決められているような現状の方式の場合、
現在実行中のタスクＴａとプログラムメモリの領域を共
有する恐れのある他のタスクＴｂとは、プログラムメモ
リに空き領域があっても同時には実行できない。従って
、現状のデータ￥７Ａｒｐｊ＋形計算機は、マルチタス
クに適しているという本来有している特徴を充分に活用
することが出来ていないというのが実情である。In general, data-driven computers are built on a computational model suitable for parallel distributed processing, and have the distinctive feature of autonomously executing multiple tasks in parallel without any special control. It is considered. However, in the case of the current system where the address of the program memory that stores each multitask is fixed,
Other tasks Tb that may share the program memory area with the currently executing task Ta cannot be executed at the same time even if there is free space in the program memory. Therefore, the reality is that the current Data\7Arpj+ type computer cannot fully utilize its inherent feature of being suitable for multitasking.

これに対して、本発明のデータ駆動形計算機では、全て
のノードのノード番号が他のノードからの相対行先ノー
ド番号で指定されるので、プログラムメモリに空き領域
があれば絶対番地とは無関係に、任意の空き領域に新た
なタスクを外部からロードし、他のタスクと並列に実行
することができる。In contrast, in the data-driven computer of the present invention, the node number of every node is specified by the relative destination node number from other nodes, so if there is free space in the program memory, it is independent of the absolute address. , a new task can be externally loaded into any free space and executed in parallel with other tasks.

以上は、１プロセツサ上でのマルチタスク実行について
述べたものであるが、第９図に示すように、各プロセッ
サに跨る個々のタスクが、各プロセッサ内で同一のプロ
グラムメモリ領域を占めるようにすれば、ネットワーク
を介して相互にパケット通信を行うマルチプロセッサに
対しても、マルチタスク実行及びタスク生成を行わせる
ことが可能であり、複数のデータ駆動形計算機を用いた
高速並列分散処理を容易に実現することができる。The above describes multitasking execution on one processor, but as shown in Figure 9, it is possible to make the individual tasks spanning each processor occupy the same program memory area within each processor. For example, it is possible to have multiprocessors that communicate packets with each other via a network execute multitasks and generate tasks, making it easy to perform high-speed parallel distributed processing using multiple data-driven computers. It can be realized.

[Brief explanation of the drawing]

第１図は本発明によるデータ駆動形計算機の一実施例を
示すブロック図、第２図は本発明によるデータ駆動形計
算機のプログラムメモリのメモリ構成、より具体的には
第３図のプログラムを格納した場合のメモリ内容を示す
模式図、第３図は本発明のデータ駆動形計算機により実
行されるプログラム（データフローグラフ）の−例を示
す模式図、第４図は従来例のデータ駆動形計算機で実行
するプログラム（データフローグラフ）の−例を示す模
式図であり、ノード番号の割付けを除いて第３図と同様
のプログラムを示し、第５図は従来例のデータ駆動形計
算機の一例を示すブロック図、第６図（ａｌは従来例の
データ駆動形計算機における命令取出部のメモリ構成を
示す図、第６図（ｂｌは行先指定部のメモリ構成を示す
図模式図であり、いずれも第４図に示すプログラムを格
納した場合のメモリ内容を示し、第７図は本発明のデー
タ駆動形計算機のプログラムメモリにおける詳細な処理
手順を示すフローチャート、第８図は相対行先ノード番
号の値の分布を示すグラフ、第９図はマルチプロセッサ
構成におけるマルチタスクの実行状態におけるプログラ
ムメモリのマツプを示す概念図である。１０・・・マツチングメモリ　　１１・・・プログラム
メモすＩ２・・・命令実行部なお、図において、同一符号は、同一または相当部分を
示している。FIG. 1 is a block diagram showing an embodiment of a data-driven computer according to the present invention, and FIG. 2 is a memory configuration of a program memory of the data-driven computer according to the present invention, and more specifically, the program shown in FIG. 3 is stored. FIG. 3 is a schematic diagram showing an example of a program (data flow graph) executed by the data-driven computer of the present invention, and FIG. 4 is a diagram of a conventional data-driven computer. Fig. 5 is a schematic diagram showing an example of a program (data flow graph) to be executed in a conventional data-driven computer. 6 (al is a diagram showing the memory configuration of the instruction fetching section in a conventional data-driven computer, and FIG. 6 (bl is a schematic diagram showing the memory configuration of the destination specifying section. FIG. 4 shows the memory contents when the program shown in FIG. The graph showing the distribution, FIG. 9, is a conceptual diagram showing a map of program memory in the execution state of multitask in a multiprocessor configuration. 10...Matching memory 11...Program memo I2...Instruction execution In the figures, the same reference numerals indicate the same or corresponding parts.

Claims

[Claims]

(1) A match detection unit that detects two pieces of data that have matching destination node numbers attached to each data to be processed; An arithmetic processing unit performs arithmetic processing according to the instruction code, and the content is read out using the destination node number attached to the data processed by the arithmetic processing unit as an address, and the data is processed based on the read content. A program memory for updating a destination node number and an instruction code, and concurrently performing arithmetic processing on two pieces of data whose destination node numbers have been detected to match by the coincidence detection unit and updating the destination node number and instruction code. A data-driven computer, characterized in that the arithmetic processing section and the program memory are connected in parallel to the coincidence detection section for execution.