JPH0667879A

JPH0667879A - Pipeline processing computer

Info

Publication number: JPH0667879A
Application number: JP22164692A
Authority: JP
Inventors: Tatsuki Nakada; 達己中田
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1992-08-20
Filing date: 1992-08-20
Publication date: 1994-03-11
Anticipated expiration: 2013-11-11
Also published as: JP2824484B2

Abstract

(57) [Abstract]
[Purpose] In a pipeline processing computer that executes a single instruction, when necessary, as multi-flow processing divided into a plurality of flows, to make it possible to use a scoreboard to correctly maintain the data interdependence between preceding and succeeding instructions during multi-flow processing.
[Constitution] The computer comprises final flow detecting means 10, which detects the final flow among the plurality of flows in multi-flow processing, and scoreboard update control means 11, which causes the contents of scoreboard 12 that correspond to the data interdependence to be updated in the detected final flow.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a data processing device, and more particularly to a pipeline processing computer equipped with a scoreboard that checks data dependencies between a preceding instruction and an instruction executed after it.

[0002] In a computer that performs pipeline processing, when there is a data dependency between a preceding instruction and the instruction executed after it, for example when the result of the first instruction's operation is used as operand data by the next instruction, processing must be carried out so that the dependency is not violated. One control method for managing this interdependence of the data used in operations is to use a scoreboard.

[0003] In this method, a scoreboard is provided as storage means that indicates, for each general-purpose register, whether the register is in use or unused. When an instruction that needs to write to a general-purpose register is executed, the scoreboard bit corresponding to that register number is set to indicate that a write to the register is pending. When the instruction's result has been written to the general-purpose register, the scoreboard bit corresponding to that register number is reset to indicate that the write has completed.

[0004] A succeeding instruction reads the scoreboard bit corresponding to the number of each register that holds its operand data, in order to check whether a preceding instruction still in execution will rewrite that register. If the bit is set, the rewrite by the preceding instruction has not completed, so the succeeding instruction must wait before starting its operation. If the bit is reset, either no preceding instruction in execution will rewrite the register, or the succeeding instruction has waited and the rewrite by the preceding instruction has completed, so the succeeding instruction can start its operation.
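The set, reset, and check operations described above can be sketched as a small Python model. This is only an illustration: the class name `Scoreboard`, its method names, and the register count of 32 are assumptions made for the example and do not come from the patent.

```python
class Scoreboard:
    """One busy bit per general-purpose register (illustrative model)."""

    def __init__(self, num_registers=32):  # register count is an assumption
        self.busy = [False] * num_registers

    def set(self, reg):
        # A write to `reg` is pending.
        self.busy[reg] = True

    def reset(self, reg):
        # The pending write to `reg` has completed.
        self.busy[reg] = False

    def must_wait(self, source_regs):
        # A succeeding instruction waits while any source register is busy.
        return any(self.busy[r] for r in source_regs)


sb = Scoreboard()
sb.set(4)                            # preceding instruction will write register 4
stalled = sb.must_wait([4, 7])       # succeeding instruction reads registers 4 and 7
sb.reset(4)                          # result has been written back
released = not sb.must_wait([4, 7])  # the succeeding instruction may now start
```

In hardware the check is a read of the selected bits rather than a method call, but the order of events is the same: set when the write is announced, reset when it lands.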

[0005] The need for scoreboard-based processing is explained with reference to FIGS. 7 and 8. FIG. 7 shows that the data interdependence may not be handled correctly if the completion of a preceding instruction's register write is not waited for. In the figure, the first instruction, a multiply instruction (mult), multiplies the contents of register 2 (gr2) by the contents of register 3 (gr3) and stores the result in register 4. The second instruction, an add instruction (add), adds the contents of register 5 and register 6 and stores the result in register 4. The third instruction, an add instruction located a little further on, adds the contents of register 4, which is meant to hold the result of the second instruction, to the contents of register 7 and stores the result in register 8. If this sequence is executed without a scoreboard, then, because the operation execution (E) stage of the first (multiply) instruction takes three stages as shown in the figure, the third instruction reads register 4 after the multiply result has been stored there and performs its addition on that value, producing an incorrect result.

[0006] FIG. 8 shows the processing when a succeeding instruction waits for the preceding instruction's register write to complete. In the figure, the scoreboard bit for register 4 is set to '1' in the first execution (E) stage of the first instruction, the multiply instruction. Execution of the second add instruction is interlocked until the multiply result has been written to register 4 and the scoreboard bit returns to '0', that is, until the bit is reset, so that the second add instruction is processed with the correct data.
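The contrast between FIG. 7 and FIG. 8 can be reproduced with a toy timing model. Everything in this sketch is an assumption made for illustration: the function `run`, the tuple layout of an instruction, the issue cycles, the register values, and the simplification that every instruction sets and resets the scoreboard bit of its destination.

```python
def run(program, initial_regs, use_scoreboard):
    """Issue instructions in order, at most one per cycle (toy model).

    Each program entry is (earliest_cycle, op, src_a, src_b, dst, e_stages).
    """
    regs = dict(initial_regs)
    busy = set()      # registers whose scoreboard bit is '1'
    pending = []      # (write_cycle, dst, value) for results still in flight
    queue = list(program)
    cycle = 0
    while queue or pending:
        # Write-back: results due this cycle reach the register file.
        for entry in [p for p in pending if p[0] == cycle]:
            _, dst, value = entry
            regs[dst] = value
            busy.discard(dst)          # reset the scoreboard bit
            pending.remove(entry)
        # Decode: try to start the next instruction.
        if queue and queue[0][0] <= cycle:
            _, op, a, b, dst, e_stages = queue[0]
            interlocked = use_scoreboard and bool({a, b, dst} & busy)
            if not interlocked:
                queue.pop(0)
                value = regs[a] * regs[b] if op == "mult" else regs[a] + regs[b]
                pending.append((cycle + e_stages, dst, value))
                if use_scoreboard:
                    busy.add(dst)      # set the scoreboard bit
        cycle += 1
    return regs


program = [
    (0, "mult", 2, 3, 4, 3),   # gr4 = gr2 * gr3, three E stages
    (1, "add", 5, 6, 4, 1),    # gr4 = gr5 + gr6, one E stage
    (4, "add", 4, 7, 8, 1),    # gr8 = gr4 + gr7, a little further on
]
regs = {2: 2, 3: 3, 4: 0, 5: 10, 6: 20, 7: 100, 8: 0}
without_scoreboard = run(program, regs, use_scoreboard=False)
with_scoreboard = run(program, regs, use_scoreboard=True)
```

With these made-up values, the run without a scoreboard leaves 106 in gr8, because the late multiply result (6) overwrites the sum in gr4 before the third instruction reads it. The run with a scoreboard holds the second instruction at decode until the multiply has written back and leaves the intended 130.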

[0007] Setting a bit in the scoreboard indicates that the register's value cannot be read (or written). For most basic instructions, such as add instructions, the execution stage generally takes only one stage, and the result produced in the execution stage can be read as data through the bypass function. An instruction whose execution stage finishes in one stage cannot be overtaken by a succeeding instruction and so cannot cause the erroneous behavior shown in FIG. 7. Therefore, for basic instructions such as add instructions, that is, the shortest instructions, which execute in one flow and need only one execution stage, data interference can be prevented even without setting and resetting the scoreboard.

[0008] FIG. 9 is a block diagram of a conventional pipeline computer that uses a scoreboard. In the figure, the pipeline computer comprises: an instruction decoder 1 that decodes instruction codes; a general-purpose register (GR) 2 that stores operand data and operation results; an arithmetic unit 3, for example an adder; an arithmetic unit 4, for example a shifter; operand registers 3a and 3b that hold the operand data (operands) for arithmetic unit 3; operand registers 4a and 4b that hold the operands for arithmetic unit 4; a scoreboard 5; a pipeline controller 6 that controls pipeline operation; a pipeline tag register 7 that holds pipeline tags under the control of pipeline controller 6; and an AND circuit 8.

[0009] In FIG. 9, instruction decoder 1 decodes the instruction code and produces source register address 0 and source register address 1 as the read register numbers, and a destination register address as the write register number. The read register numbers are input to the read register ports of GR 2, and read data 0 and read data 1, the source operands, are held in operand registers 3a, 3b, 4a, and 4b. The outputs of these operand registers are input to arithmetic units 3 and 4, and the operation result is input as write data to the write data port of GR 2. Together with the result, the destination register address (the write register number), with its timing aligned by pipeline tag register 7, is input as the write register address to the write register port of GR 2. A write control signal is also generated by instruction decoder 1, timing-aligned by pipeline tag register 7, and input to GR 2, but this control signal is not shown.

[0010] The above describes operation when there is no data interference. When there is data interference, the read register numbers and the write register number output from instruction decoder 1 are input to the read register check ports (RD0 CHK, RD1 CHK) and the write register check port (WR CHK) of scoreboard 5, and each of these register numbers is used as a selection signal for selecting the in-use/unused register inside scoreboard 5. The signals from the selected registers, SRC0 REG BUSY, SRC1 REG BUSY, and WR REG BUSY, are input to pipeline controller 6 and are used to decide whether to interlock the decode stage as shown in FIG. 8.

[0011] If any of these three signals to pipeline controller 6 indicates that a scoreboard bit is '1', output of the D-stage release signal, that is, the signal indicating that the decode stage may complete and the instruction may proceed to the execution stage, is suppressed.
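The release condition just described reduces to a single Boolean expression. The sketch below is a minimal illustration; the function name and parameter names are chosen here to mirror the three busy signals and are not part of the patent.

```python
def d_stage_release(src0_reg_busy, src1_reg_busy, wr_reg_busy):
    """Return True when the decode stage may hand over to the execution stage.

    The release is suppressed while any of the three checked scoreboard
    bits (two source registers, one destination register) reads '1'.
    """
    return not (src0_reg_busy or src1_reg_busy or wr_reg_busy)


all_clear = d_stage_release(False, False, False)      # nothing pending
source_pending = d_stage_release(True, False, False)  # a source is still being written
dest_pending = d_stage_release(False, False, True)    # the destination is still busy
```

Checking the destination register as well as the sources is what keeps a short instruction from writing a register before an earlier, longer instruction has finished writing it.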

[0012] When, for example, a preceding instruction completes its register write, the in-use/unused register in scoreboard 5 is reset, and none of the three signals to pipeline controller 6 indicates a set scoreboard bit any longer, the D-stage release signal is output to the four operand registers and is also applied to the clock enable terminals of the D flip-flops in pipeline tag register 7 and to AND circuit 8 in front of scoreboard 5.

[0013] The execution stage of the succeeding instruction is then processed. At this time, the destination register address from instruction decoder 1, which is applied to the other input of AND circuit 8, is passed to the WR SET terminal of scoreboard 5, and the scoreboard bit corresponding to the write register number is set. Further, when the E-stage release signal output from pipeline controller 6 is input to pipeline tag register 7, the destination register address is applied as the write register address to the WR RES terminal of the scoreboard, and the corresponding scoreboard bit is reset. As described above, this write register address is also applied to GR 2.

[0014]

[Problems to Be Solved by the Invention] Before describing the problems the invention is intended to solve, multi-flow processing and the use of a scoreboard during parallel instruction execution are explained. Multi-flow processing, also called stage expansion processing, executes a single instruction divided into a plurality of flows, as though it were a plurality of instructions.

[0015] For example, when executing an instruction that transfers the values of a pair of 4-byte general-purpose registers (GR) to an 8-byte floating-point register (FR), multi-flow processing reads 4 bytes at a time using only one port on the general-purpose register side and transfers them to the floating-point register, and does this twice. FIG. 10 is an example of this processing and shows the contents of the register pair denoted r1 and r1+1 being transferred in two flows.
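The two-flow transfer of a register pair can be sketched as follows. The function name, the dictionary used as a register file, and in particular the choice that r1 supplies the upper half of the 8-byte value are assumptions made for this example; the text does not state the byte order.

```python
def transfer_pair_to_fr(gr, r1):
    """Move the GR pair (r1, r1 + 1) into one 8-byte value in two flows."""
    fr_value = 0
    for flow in range(2):
        # One 4-byte read through a single GR read port per flow.
        # The decoder supplies a different register number in each flow.
        word = gr[r1 + flow] & 0xFFFFFFFF
        # Assumption: r1 holds the upper half, r1 + 1 the lower half.
        fr_value = (fr_value << 32) | word
    return fr_value


gr = {2: 0x11223344, 3: 0x55667788}   # made-up register contents
fr = transfer_pair_to_fr(gr, 2)
```

The point of the split is that each flow needs only one 4-byte read port, at the cost of a second pass through the pipeline.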

[0016] FIG. 11 is an example of multi-flow processing for a multiply instruction. Multiply instructions are generally executed infrequently, so a drop in their performance has little effect on overall system performance. In FIG. 11, therefore, the two operands for the multiplication are transferred to the multiplier in two flows, which reduces the width of the data bus used to transfer operand data.

[0017] However, FIG. 12 shows an example in which a multiply instruction executed as multi-flow processing hangs when the scoreboard is set. FIG. 12 shows a multiply instruction that multiplies the contents of register 2 by the contents of register 3 and stores the result in register 3, executed in two flows. In the first flow, the contents of register 2 are sent to the multiplier, and because the product is to be stored in register 3, the scoreboard bit for register 3 is set in the execution stage of the first flow. Consequently, when the second flow tries to read register 3, it cannot, because the scoreboard bit is set, and it interlocks. The scoreboard bit is reset by the write to register 3, but that write cannot happen until the execution stage of the second flow produces its result. Because the second flow is interlocked in the decode stage, the bit can never be reset and the machine hangs. Such a hang is a hardware fault state, also called a deadlock, and is never resolved. By contrast, in the interlock state shown in FIG. 8, processing resumes when the scoreboard bit is reset.
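The hang described for FIG. 12 can be demonstrated with a bounded loop. This is a deliberately small sketch: the function name, the one-flow-per-iteration timing, and the cycle limit of 100 are assumptions made for the example.

```python
def conventional_two_flow_multiply(max_cycles=100):
    """Return how many flows complete within max_cycles (toy model).

    Models mult gr2, gr3 -> gr3 in two flows, with the scoreboard bit for
    the destination set in the execution stage of the first flow.
    """
    flow_sources = [2, 3]   # flow 1 reads gr2, flow 2 reads gr3
    dst = 3
    busy = set()            # registers whose scoreboard bit is '1'
    completed = 0
    for _ in range(max_cycles):
        if completed == len(flow_sources):
            break
        if flow_sources[completed] in busy:
            continue        # decode-stage interlock; nothing can clear the bit
        if completed == 0:
            busy.add(dst)   # conventional policy: set in the first flow
        if completed == len(flow_sources) - 1:
            busy.discard(dst)   # write-back would reset the bit
        completed += 1
    return completed


flows_done = conventional_two_flow_multiply()
```

Only the first flow completes. The second flow waits on a bit that can be cleared only by its own write-back, which is the circular wait the text calls a deadlock.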

[0018] Next, a problem that arises during parallel instruction execution is described with reference to FIG. 13. In this example, an add instruction and a multiply instruction are executed in parallel. The add instruction adds the contents of register 1 and register 2 and stores the result in register 3. The multiply instruction multiplies the contents of register 2 by the contents of register 3 and stores the result in register 3. The add instruction finishes in the first three stages, whereas the multiply instruction is divided into two flows, that is, executed as multi-flow processing.

[0019] Suppose that, to avoid the problem described with FIG. 12, the scoreboard bit is set when the second flow of the multiply instruction reads register 3. By that time, the result of the add instruction has already been stored in register 3, and the multiply instruction ends up operating on that sum. Thus, for parallel execution of instructions, the scoreboard bit must be set with the data interdependence taken into account, and execution of a basic instruction, the instruction with the shortest execution time, must also be delayed with the data interdependence taken into account. This is the problem.

[0020] An object of the present invention is to make it possible, when an instruction that requires multi-flow processing is executed, to process it while correctly maintaining the data interdependence by means of the scoreboard, and, when an instruction executable in a single flow and an instruction that requires multi-flow processing are executed in parallel, to perform correct processing without violating the data interdependence.

[0021]

[Means for Solving the Problems] FIG. 1 is a block diagram of the principle of the present invention. It shows the principle of a pipeline processing computer that has a scoreboard for checking data dependencies between a preceding instruction and an instruction executed after it, and that can execute a single instruction divided into a plurality of flows.

[0022] In FIG. 1, final flow detecting means 10 is, for example, a multiflow controller, and detects the final flow of multi-flow processing, in which a single instruction is executed divided into a plurality of flows. Scoreboard update control means 12 is, for example, two AND circuits provided on the input side of scoreboard 11, and causes the contents of scoreboard 11 that correspond to the data dependency described above to be updated in the final flow of multi-flow processing.

[0023]

[Operation] In the present invention, the scoreboard contents are updated, that is, set and reset, only in the final flow of multi-flow processing, in which a single instruction is executed divided into a plurality of flows. The operation of the invention is explained with reference to FIG. 2.

[0024] As explained with FIG. 1, in the present invention, when final flow detecting means 10 detects the final flow of multi-flow processing, the scoreboard bit corresponding to, for example, the register to which the result of executing the instruction will be written is set. This setting is performed by scoreboard update control means 12. FIG. 2 processes the same instruction as FIG. 12. In FIG. 12, the scoreboard bit was set to '1' in the execution stage of the first flow, as soon as it became known in the first flow that the product would be stored in register 3. In FIG. 2, the scoreboard is set in the first execution stage of the second flow, that is, the final flow. The contents of register 3 are thus transferred to the general-purpose register GR in the first execution stage of the second flow, the multiplication is then carried out, and the scoreboard bit is reset to '0' when the product is stored in register 3.
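The effect of setting the bit only in the final flow can be traced with a short sketch. The function name, the event tuples, and the collapsing of each stage into one step are assumptions made for illustration and are not taken from the figures.

```python
def trace_final_flow_policy(flow_sources, dst):
    """List (flow, stage, action) events when only the final flow updates
    the scoreboard (toy model)."""
    busy = set()              # registers whose scoreboard bit is '1'
    events = []
    last = len(flow_sources)
    for flow, src in enumerate(flow_sources, start=1):
        if src in busy:
            # Decode stage would interlock on its own destination bit.
            events.append((flow, "D", "interlock"))
            return events
        events.append((flow, "D", "released"))
        if flow == last:
            busy.add(dst)     # set in the execution stage of the final flow
            events.append((flow, "E", "set"))
    busy.discard(dst)         # reset when the result is written back
    events.append((last, "W", "reset"))
    return events


# mult gr2, gr3 -> gr3 in two flows: flow 1 reads gr2, flow 2 reads gr3.
events = trace_final_flow_policy([2, 3], dst=3)
```

Because the bit for register 3 is still clear when the second flow is decoded, that flow is released, and the bit is held only from the final flow's execution stage until write-back.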

[0025] As described above, in the present invention the scoreboard is set and reset only in the final flow of multi-flow processing. As a result, processing that does not disturb the data dependency between preceding and succeeding instructions is guaranteed by processing that includes pipeline interlocks.

[0026] When an instruction that requires multi-flow processing and an instruction that does not are executed simultaneously, the scoreboard is set and reset in the final flow of the multi-flow processing, and execution of the instruction that does not require multi-flow processing is also delayed until that final flow.

[0027]

[Embodiment] FIG. 3 is a block diagram of the overall configuration of the pipeline processing computer of the present invention. Parts that are the same as in the conventional example of FIG. 9 carry the same reference numerals. FIG. 3 differs from the conventional example of FIG. 9 in that a multiflow controller 20 for detecting the final flow in multi-flow processing is added, and in that, on the input side of scoreboard 5, an AND circuit 21 replaces AND circuit 8 and an AND circuit 22 is newly provided.

[0028] Multiflow controller 20 receives from instruction decoder 1 either a two-flow operation signal, indicating that the multi-flow processing is divided into two flows, or a three-flow operation signal, indicating that it is divided into three flows, and also receives the D-stage release signal from pipeline controller 6. The last flow detection signal from multiflow controller 20 is input to the two AND circuits 21 and 22, and the flow counter signal is output to instruction decoder 1. AND circuit 21 receives, together with the last flow detection signal, the destination register address indicating the write register number and the D-stage release signal, as in FIG. 9. When all of these signals are present, the write register set input is applied to scoreboard 5. AND circuit 22 receives, together with the last flow detection signal, the write register address from pipeline tag register 7. When these signals are present, the write register reset signal is input to scoreboard 5.

[0029] The processing of each stage in FIG. 3 is outlined below. In the decode (D) stage, the instruction is decoded to determine whether it is a multi-flow instruction, how many flows its execution requires, and whether it requires a scoreboard update. The decode results, the multiflow counter value, and so on are held in the pipeline tag and used in the later stages.

[0030] The scoreboard bit corresponding to the write register number is checked, and if it is set, the instruction interlocks in the instruction decode stage. The scoreboard bit corresponding to each read register number is checked, and if it is set, the instruction interlocks in the instruction decode stage.

[0031] The output of the multiflow counter, which indicates which flow of a multi-flow instruction is currently being processed, always shows a value equivalent to executing the first flow when the instruction is not a multi-flow instruction. To detect the final flow, the multiflow counter value is compared with the number of flows required to execute the instruction. For an instruction that is not a multi-flow instruction, the multiflow counter always shows the value equivalent to executing the first flow, so the flow is detected as the final flow.
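The comparison described above can be written as a one-line predicate. In this sketch the counter is treated as 1-based ("first flow" equals 1) and a single-flow instruction is treated as requiring one flow; both conventions, like the function and parameter names, are assumptions made for the example.

```python
def is_final_flow(flow_counter, flows_required, is_multiflow):
    """Compare the current flow number with the number of flows required."""
    if not is_multiflow:
        # The counter reads as "first flow" and exactly one flow is needed,
        # so an ordinary instruction is always in its final flow.
        flow_counter, flows_required = 1, 1
    return flow_counter == flows_required


single = is_final_flow(1, 1, is_multiflow=False)        # ordinary instruction
first_of_two = is_final_flow(1, 2, is_multiflow=True)   # first flow of two
second_of_two = is_final_flow(2, 2, is_multiflow=True)  # second flow of two
```

Treating an ordinary instruction as a one-flow instruction lets the same final-flow signal gate the scoreboard updates for every instruction, with no special case downstream.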

[0032] In the operation execution (E) stage, besides performing the operation, if the instruction in the execution stage requires a scoreboard update and is executing its final flow, the scoreboard bit corresponding to the write register number is set. Some instructions also read registers in this stage.

[0033] In the write (W) stage, besides processing such as writing the result obtained in the execution stage to a register, if the instruction in the write (W) stage requires a scoreboard update and is executing its final flow, the scoreboard bit corresponding to the write register number is reset.

[0034] FIG. 4 is a block diagram of an embodiment of the multiflow controller. In the figure, multiflow controller 20 comprises: an OR circuit 23 that receives the two-flow operation and three-flow operation signals from instruction decoder 1; an inverter 24 that receives the output of OR circuit 23; an AND circuit 25 that receives the D-stage release signal output by pipeline controller 6 and the clock signal; a flow counter 26; an AND circuit 27 that receives the output of flow counter 26 and the two-flow operation signal; an AND circuit 28 that receives the output of flow counter 26 and the three-flow operation signal; an OR circuit 29 that receives the outputs of inverter 24 and AND circuits 27 and 28; and an inverter 30 that inverts the output of OR circuit 29.

[0035] In FIG. 3, the instruction decoder decodes how many flows the instruction to be executed consists of. For a multi-flow instruction, the value of FLOW COUNTER tells the decoder the flow number of the instruction currently being executed. The final flow is detected (LAST FLOW signal), and in the final flow the decoder controls the program counter update and selects an instruction from the instruction buffer (queue).

[0036] The instruction decoder uses the FLOW COUNTER value to change part of the decode result for the same instruction depending on the flow. For example, in the GR-to-FR transfer described with FIG. 10, a different read register number is supplied in the first flow and in the second flow, so that two registers are read.

[0037] In FIG. 4, for an instruction processed in a single flow rather than by multi-flow processing, the instruction decoder 1 outputs neither the two-flow operation signal nor the three-flow operation signal. The output of the OR circuit 23 is therefore '0' and the output of the inverter 24 is '1', so the output of the OR circuit 29 indicates that the flow is the last flow.

[0038] In contrast, when the multi-flow processing consists of two flows, the instruction decoder 1 supplies the two-flow operation signal to the uppermost input terminal of the AND circuit 27. The output of the AND circuit 27 becomes '1' when '1' is applied to its second input terminal from the top and '0' to its third input terminal, and the output of the OR circuit 29 then indicates the last flow. That is, the output of the flow counter at this time is '01'. The AND circuit 25 receives the clock signal together with the D stage release signal from the pipeline controller 6, and the output of the flow counter 26 is incremented if the D stage release signal is asserted at the rising edge of the clock signal. When the number of flows in the multi-flow processing is 2, the output of the flow counter 26 is '0' for the first flow and becomes '1' when the D stage release signal indicating that the D stage of the second flow has completed is input. The OR circuit 29 then outputs the last flow detection signal.

[0039] When the multi-flow processing consists of three flows, the three-flow operation signal is applied to the uppermost input terminal of the AND circuit 28, and the output of the AND circuit 28 becomes '1' when the input to its second input terminal is '0' and the input to its third input terminal is '1'. That is, the output of the flow counter 26 is 2, i.e. '10', when the D stage release signal is input for the third flow, and at that point the OR circuit 29 outputs the last flow detection signal.
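The last-flow detection described for the single-flow, two-flow, and three-flow cases can be sketched as a small combinational function. This is only an illustrative model, not the circuit itself: the function name `last_flow`, the 2-bit integer representation of the flow counter, and the mapping of the second and third AND inputs to the low-order and high-order counter bits are assumptions inferred from the '01' and '10' values given in the text.

```python
def last_flow(two_flow: bool, three_flow: bool, counter: int) -> bool:
    """Illustrative model of the last-flow detection at OR circuit 29.

    counter is assumed to be the 2-bit output of flow counter 26.
    """
    low = bool(counter & 0b01)   # assumed: second input of AND 27 / AND 28
    high = bool(counter & 0b10)  # assumed: third input of AND 27 / AND 28

    single = not (two_flow or three_flow)    # OR 23 followed by inverter 24
    and27 = two_flow and low and not high    # two flows, counter == '01'
    and28 = three_flow and not low and high  # three flows, counter == '10'
    return single or and27 or and28          # OR 29
```

For example, `last_flow(True, False, 1)` models the second flow of a two-flow instruction, and `last_flow(False, True, 2)` models the third flow of a three-flow instruction.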

[0040] FIG. 4 also shows the following behavior. When the D stage release signal is '1' and the last flow detection signal is '1' at the rising edge of the clock signal, that is, when the D stage release signal is output for the first flow of the next multi-flow processing after the last flow has been detected, the flow counter is reset. When the D stage release signal is '1' and the last flow detection signal is '0' at the rising edge of the clock signal, the flow counter is incremented. When the D stage release signal is '0' at the rising edge of the clock signal, the output of the flow counter does not change. This behavior is effected by the inverter 30.
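The three counter update rules can be illustrated with a minimal sketch. The names `next_counter` and `run` are hypothetical, and `run` assumes for simplicity that the D stage release signal is asserted on every clock edge and that the last flow is the one whose counter value equals the number of flows minus one, which matches the one-, two-, and three-flow cases described in the text.

```python
def next_counter(counter: int, d_release: bool, last: bool) -> int:
    """Flow counter value after one rising clock edge."""
    if not d_release:
        return counter       # D stage not released: hold
    if last:
        return 0             # released in the last flow: reset
    return counter + 1       # released in a non-final flow: increment


def run(num_flows: int):
    """Trace (counter, last) for each flow of one instruction.

    Assumes D stage release is '1' on every clock edge (no interlock).
    """
    counter = 0
    trace = []
    for _ in range(num_flows):
        last = counter == num_flows - 1   # assumed last-flow condition
        trace.append((counter, last))
        counter = next_counter(counter, d_release=True, last=last)
    return trace, counter
```

Under these assumptions the counter returns to 0 after the final flow, ready for the next instruction.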

[0041] FIG. 5 is a block diagram of an embodiment of the scoreboard according to the present invention. In the figure, the scoreboard consists of two 4-input, 16-output decoders 33 and 34, sixteen SR flip-flops 35, and three 16-input, 1-output selectors 36, 37, and 38.

[0042] In FIG. 5, the decoder 33 sets one of the sixteen SR flip-flops 35, chosen by the content of the 4-bit WR SET signal from the AND circuit 21, when the D stage write enable signal (D WE) is input. The decoder 34 resets one of the SR flip-flops 35, chosen by the content of the 4-bit WR RES signal output by the AND circuit 22, when the W stage write enable signal (W WE) is input.

[0043] Next, the selector 36 selects the output of one of the sixteen SR flip-flops 35 according to the 4-bit content of RD0 CHK, the input signal to the scoreboard's read register check port that carries the read register number SRC REG ADR0 output from the instruction decoder 1, and outputs it to the pipeline controller 6 as the SRC0 REG BUSY signal. Similarly, the selector 37 selects the output of one of the sixteen flip-flops 35 according to the 4-bit content of RD1 CHK, the input signal to the other read register check port, and outputs it to the pipeline controller 6 as the SRC1 REG BUSY signal. The selector 38 selects the output of one of the sixteen flip-flops 35 according to the 4-bit content of WR CHK, the input signal to the write register check port, and outputs it to the pipeline controller 6 as the WR REG BUSY signal.
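The scoreboard structure of FIG. 5 can be modeled in a few lines. This is a behavioral sketch under stated assumptions, not the circuit: the class and method names are hypothetical, the argument names merely mirror the signal names in the text, and the ordering that lets a set win over a simultaneous reset of the same register is an arbitrary choice that the text does not specify.

```python
class Scoreboard:
    """Behavioral model: one busy bit (SR flip-flop 35) per register."""

    def __init__(self, num_regs: int = 16):
        self.busy = [False] * num_regs

    def clock(self, d_we=False, wr_set=0, w_we=False, wr_res=0):
        """Apply one update cycle."""
        if w_we:                       # decoder 34: reset on W stage write enable
            self.busy[wr_res] = False
        if d_we:                       # decoder 33: set on D stage write enable
            self.busy[wr_set] = True   # assumed: set wins if both hit one register

    def check(self, rd0_chk: int, rd1_chk: int, wr_chk: int):
        """Selectors 36, 37, 38: (SRC0, SRC1, WR) REG BUSY."""
        return self.busy[rd0_chk], self.busy[rd1_chk], self.busy[wr_chk]
```

A typical use is to call `clock(d_we=True, wr_set=r)` when an instruction that writes register `r` leaves the D stage, and `clock(w_we=True, wr_res=r)` when its result is written back; `check` then tells a following instruction whether any of its registers is still pending.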

[0044] FIG. 6 is a block diagram of an embodiment of the pipeline controller. In FIG. 3, the pipeline controller 6 receives the three BUSY signals from the scoreboard 5 and the other interlock condition signals; of these, the three BUSY signals and the other D stage interlock condition signals are input to the OR circuit 40. The AND circuit 41 receives the output of the OR circuit 40, the D stage valid signal (a signal indicating that there is an instruction executing the D stage), and the output of the AND circuit 45 described later. The output of the AND circuit 41 is the D stage release signal, which indicates that the instruction executing the D stage has completed. Its value is '1' when the D stage valid signal is '1' and the outputs of the OR circuit 40 and the AND circuit 45 are both '0'. In other words, even if the D stage valid signal is '1' and the D stage is valid, the output of the AND circuit 41 does not become '1' if the OR circuit 40 outputs '1' because a D stage interlock condition exists, or if the AND circuit 45 outputs '1' because the instruction executing in the E stage is interlocked. This is because, if the instruction in the D stage advanced to the E stage, a register could, for example, be overwritten before the instruction interlocked in the E stage has completed. The interlock resulting from the scoreboard check is regarded as one of the D stage interlock conditions.

[0045] The D stage release signal output by the AND circuit 41 is supplied through the flip-flop 42 to the AND circuits 43 and 45 as the E stage valid signal, that is, a signal indicating that, the D stage having finished, there is an instruction executing the E stage. The AND circuit 43 also receives the E stage interlock condition signal, which is one of the other interlock condition signals supplied to the pipeline controller 6 in FIG. 3, and the output of the AND circuit 48 described later. These two signals are also supplied to the AND circuit 45 through the OR circuit 44. The output of the AND circuit 45 thus indicates, as noted above, that the instruction executing in the E stage is interlocked by an E stage interlock condition.

[0046] The output of the AND circuit 43 is '1' when the E stage valid signal is '1' and the E stage interlock condition signal and the output of the AND circuit 48 are both '0'. The output of the AND circuit 43 is the E stage release signal, which indicates that the instruction executing the E stage has completed. As with the D stage release signal, even if the E stage is valid, its value does not become '1' if an E stage interlock condition exists or if the instruction to be executed in the W stage is interlocked. This is because, if the instruction executing in the E stage advanced to the W stage, a register could, for example, be overwritten before the instruction interlocked in the W stage has completed.

[0047] The output of the AND circuit 43, the E stage release signal, is supplied through the flip-flop 46 to the AND circuit 47 as the W stage valid signal, which indicates that there is an instruction executing the W stage. The AND circuit 47 also receives the W stage interlock condition signal. When the W stage valid signal is '1' and the W stage interlock condition signal is '0', the AND circuit 47 outputs the W stage release signal, which indicates that the instruction executing the W stage has completed. The AND circuit 48, for its part, receives the W stage valid signal and the W stage interlock condition signal; when both are '1', the output of the AND circuit 48 becomes '1', and this output is supplied to the AND circuit 43 and the OR circuit 44 as described above.
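The release logic of the pipeline controller can be summarized as a purely combinational sketch. The function and field names below are hypothetical; only the gate relationships are taken from the description of the OR circuits 40 and 44 and the AND circuits 41, 43, 45, 47, and 48. The flip-flops 42 and 46 that carry the valid signals from one stage to the next are deliberately left out, so the valid signals are plain inputs here.

```python
from dataclasses import dataclass


@dataclass
class Release:
    d: bool  # D stage release (AND 41)
    e: bool  # E stage release (AND 43)
    w: bool  # W stage release (AND 47)


def release_signals(d_valid, e_valid, w_valid,
                    busy=(False, False, False),
                    d_other_ilk=False, e_ilk=False, w_ilk=False) -> Release:
    """Combinational release logic; busy holds the three scoreboard BUSY signals."""
    or40 = any(busy) or d_other_ilk              # D stage interlock conditions
    and48 = w_valid and w_ilk                    # W stage is interlocked
    w_rel = w_valid and not w_ilk                # AND 47
    e_rel = e_valid and not e_ilk and not and48  # AND 43
    or44 = e_ilk or and48                        # OR 44
    and45 = e_valid and or44                     # E stage is interlocked
    d_rel = d_valid and not or40 and not and45   # AND 41
    return Release(d_rel, e_rel, w_rel)
```

In this model a W stage interlock propagates backward: with all three stages valid, `w_ilk=True` holds the W, E, and D stages at once, which corresponds to the overwrite hazard the text describes.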

[0048] Next, an embodiment of the present invention for parallel instruction execution is described. To solve the problem described with reference to FIG. 13, when an instruction that requires multi-flow processing and an instruction that needs only a single flow are executed in parallel, the scoreboard must be set and reset in the final flow of the multi-flow processing, and execution of the single-flow instruction must be delayed until that final flow.

[0049] Accordingly, when, for example, a load instruction (memory access instruction) and a multiply instruction are executed simultaneously, mutual interference of data can be prevented by delaying both the execution of the load instruction, which is a single-flow instruction, and its scoreboard set/reset until the final flow of the multiply instruction.
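The timing rule for a parallel pair can be pictured with a toy schedule. This sketch is purely illustrative: the function name `issue_schedule` and the dictionary keys are hypothetical, and the number of flows passed in is an arbitrary example value, since the flow count of the multiply instruction is not stated here.

```python
def issue_schedule(multi_flows: int):
    """Per-flow schedule for a multi-flow instruction paired with a
    single-flow instruction executed in parallel."""
    schedule = []
    for flow in range(1, multi_flows + 1):
        final = flow == multi_flows
        schedule.append({
            "flow": flow,
            "single_issues": final,       # single-flow instruction waits for the final flow
            "scoreboard_update": final,   # set/reset happens only in the final flow
        })
    return schedule
```

With `issue_schedule(3)`, only the third entry has `single_issues` and `scoreboard_update` set, reflecting that both are deferred to the final flow.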

[0050] In the embodiment described above, two data items are used in an operation, and accordingly there are two read registers and one write register; naturally, the numbers of data items and registers are not limited to these. Likewise, the register numbers (addresses) are 4 bits wide, and accordingly the scoreboard in FIG. 5 contains sixteen flip-flops; naturally, the bit width and the number of flip-flops are not limited to these either. Furthermore, the multi-flow controller of FIG. 4 was described for two or three flows, but the number of flows in multi-flow processing is of course not limited to these.

[0051]

[Effects of the Invention] As described in detail above, the present invention makes it possible to use a scoreboard for multi-flow processing, in which one instruction is divided into a plurality of flows, without breaking the data dependencies. When instructions are executed in parallel, mutual interference of data can also be resolved by setting and resetting the scoreboard in the final flow for an instruction that requires multi-flow processing, and by delaying execution of an instruction that does not require multi-flow processing until that final flow.

[Brief description of drawings]

FIG. 1 is a block diagram showing the principle of the present invention.

FIG. 2 is a diagram showing how a multiply instruction is executed in the present invention.

FIG. 3 is a block diagram showing the configuration of an embodiment of a pipeline processing computer according to the present invention.

FIG. 4 is a block diagram showing the configuration of an embodiment of the multi-flow controller.

FIG. 5 is a block diagram showing the configuration of an embodiment of the scoreboard.

FIG. 6 is a block diagram showing the configuration of an embodiment of the pipeline controller.

FIG. 7 is a diagram showing the operation when a subsequent instruction does not wait for completion of the preceding instruction.

FIG. 8 is a diagram showing the operation of processing that uses a scoreboard.

FIG. 9 is a block diagram showing the configuration of a conventional pipeline processing computer.

FIG. 10 is a diagram showing an example of multi-flow processing.

FIG. 11 is a diagram showing an example of multi-flow processing of a multiply instruction.

FIG. 12 is a diagram showing an example of a hang-up state of a multiply instruction.

FIG. 13 is a diagram explaining a problem in parallel execution of instructions.

[Explanation of symbols]

1: instruction decoder; 2: general-purpose register (GR); 3, 4: arithmetic unit; 5, 11: scoreboard; 6: pipeline controller; 7: pipeline tag register; 10: final flow detecting means; 12: scoreboard update control means; 20: multi-flow controller

Claims

[Claims]

1. A pipeline processing computer that has a scoreboard for checking a data dependency between a preceding instruction and an instruction executed subsequent to the preceding instruction, and that can execute one instruction divided into a plurality of flows, the pipeline processing computer comprising: final flow detecting means (10) for detecting the final flow of the plurality of flows; and scoreboard update control means (11) for causing the stored content of the scoreboard (12) corresponding to the dependency to be updated in the detected final flow, whereby processing that does not disturb the dependency is guaranteed.

2. The pipeline processing computer according to claim 1, wherein the pipeline processing computer executes a plurality of instructions in parallel, and, when at least one of the instructions to be executed in parallel is an instruction to be executed divided into a plurality of flows, the stored content of the scoreboard for that instruction is updated in the final flow of the plurality of flows, and, for an instruction that can be executed in a single flow, execution of the instruction and updating of the stored content of the scoreboard are delayed until the final flow of the instruction executed divided into a plurality of flows.