JPH04130926A - Instruction supplying system

Info

Publication number: JPH04130926A
Application number: JP25037890A
Authority: JP
Inventors: Tatsuhiro Goshima; 龍宏五島
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1990-09-21
Filing date: 1990-09-21
Publication date: 1992-05-01

Abstract

PURPOSE: To easily obtain the predicted next instruction address of each control instruction and to execute rapid parallel processing by holding, in an instruction string, the predicted next instruction address to be executed next and supplying instructions to a parallel processing part. CONSTITUTION: An instruction (i) and the predicted next instruction address are read from the location in an instruction storage device 1 indicated by an address register 11 and output to a router 7 through a path A. When the identification (ID) flag of the input data, supplied through a path B, is '0', the router 7 decides that the data is an instruction and supplies it to a parallel processor 13 through a path D. When the ID flag of the input data is '1', the predicted next instruction address is output to a selector 9 through a path C. When the data from the path A is a normal instruction, the value of the address register 11 is incremented; when the data from the path A is a predicted next instruction address, that address is set in the address register 11. After resolving a control instruction, the processor 13 outputs the branch destination address through a path F, and when the prediction has failed, the correct instruction is refetched using the address set in the register 11.

Description

DETAILED DESCRIPTION OF THE INVENTION [Object of the Invention] (Industrial Application Field) The present invention relates to an instruction supply system for a computer that executes instructions in parallel at the instruction level in order to run programs at high speed.

(Prior Art) In general, most computers adopt instruction-level parallel execution in order to execute programs at high speed. Pipelining is the most widespread form of such parallel execution. To operate a pipelined computer at maximum efficiency, instructions must be supplied to the processing unit without interruption. However, when there are dependencies between instructions, such as data reference relationships, instructions cannot be supplied continuously; this obstructs pipelined parallel execution and lowers processing speed. In particular, control instructions such as branch instructions change the execution order of instructions and are therefore a major obstacle to the continuous supply of instructions.

Several countermeasures have been adopted to reduce the slowdown caused by control instructions.

In software, the countermeasure is delayed branching, which statically schedules the order of instruction execution. In hardware, it is branch prediction, which schedules the order of instruction execution dynamically.

A specific example will be explained using FIG. 2, which schematically shows an instruction sequence including control instructions.

For example, an IF statement in a program written in a high-level language such as FORTRAN is generally converted by the compiler into a control instruction in the form of a conditional branch instruction.

The instruction sequence in FIG. 2 consists of 11 instructions organized into four basic blocks (BB1 to BB4). A basic block is a unit of a program viewed from the standpoint of execution control. When program control is transferred to the first instruction of a basic block (specifically, for example, when the program counter, which holds the address of the instruction being processed, holds the address of the first instruction of the basic block), all instructions constituting that basic block are always executed. Basic block boundaries are determined by the destinations of the control instructions in the instruction sequence (for example, the jump destination address of a branch instruction). The number of instructions in a basic block therefore depends on the nature of the program and can take any value.

The execution order of the basic blocks will be explained using FIG. 3.

Because basic block BB1 contains a conditional branch instruction, the basic block executed next depends on whether the condition holds. Execution therefore proceeds from basic block BB1 to either basic block BB2 or basic block BB3, and then to basic block BB4.

Consider executing this instruction sequence on a pipelined computer whose pipeline consists of four stages: instruction decode (D), operand read (O), execute (E), and write (W). The jump destination address calculation and the condition evaluation of a conditional branch instruction are performed in the E stage.

Next, the case in which the instructions of the sequence are not scheduled at all will be described with reference to FIG. 4, which shows how the instructions in the pipeline change over time.

The instructions are fed into the pipeline in the order in which they are placed in memory; the horizontal axis of the figure represents the passage of time. During pipeline processing the instruction supply is interrupted. The interruption arises because the conditional branch instruction in basic block BB1 transfers control to basic block BB3 rather than to basic block BB2, which follows it in program order.

In other words, the "2" instruction of basic block BB2, which has already been fed into the pipeline by the time the branch instruction of basic block BB1 completes processing in the execution (E) stage, is an instruction that must not actually be executed.
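The cost of such an interruption can be estimated with a small calculation. The sketch below assumes one instruction enters the pipeline per cycle and that a branch is resolved at the end of the E stage, as described for the four-stage pipeline above; the function names and the instruction counts are illustrative only.

```python
STAGES = ["D", "O", "E", "W"]  # decode, operand read, execute, write


def wrong_path_instructions(resolve_stage: str = "E") -> int:
    """Instructions that enter the pipeline behind a branch before it resolves.

    While the branch moves from D to the resolving stage, one further
    instruction enters per cycle, so the count equals the stage's index.
    """
    return STAGES.index(resolve_stage)


def total_cycles(useful_instructions: int, taken_branches: int) -> int:
    """Cycles to finish when every taken branch wastes its wrong-path slots."""
    fill = len(STAGES) - 1  # cycles needed before the first instruction leaves W
    return fill + useful_instructions + taken_branches * wrong_path_instructions()


penalty = wrong_path_instructions()
without_branch = total_cycles(useful_instructions=8, taken_branches=0)
with_branch = total_cycles(useful_instructions=8, taken_branches=1)
print(penalty, without_branch, with_branch)
```

Under these assumptions a single unscheduled taken branch costs two pipeline slots, which is the gap in instruction supply that the countermeasures discussed in this section try to remove.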

(Problems to be Solved by the Invention) Scheduling by delayed branching changes the execution order of the instructions in basic block BB1 so that an instruction that is always executed is moved to the position immediately after the conditional branch instruction. This keeps instructions that will not be executed out of the pipeline and increases processing efficiency. However, when such scheduling is impossible, as in the case shown in FIG. 2, delayed branching has little effect. A further problem is that a scheduler for the pipeline is difficult to develop.

Branch prediction, on the other hand, predicts the destination of a control instruction from the history of branches taken when that control instruction was executed in the past, and feeds the instructions more likely to be executed into the pipeline. When a particular control instruction is executed for the first time there is no branch history, so the branch is usually predicted not to occur and the instruction that follows the control instruction in program order is fed into the pipeline. As long as the predictions are correct, high-speed processing can be achieved. In practice, however, some control instructions show no bias in branch direction, and it is difficult to keep a branch history for every control instruction. In particular, conventional branch prediction requires a large amount of hardware such as a branch table, so its cost performance is poor compared with software scheduling methods such as delayed branching.
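To make the weakness of history-based prediction concrete, here is a minimal sketch of a last-outcome predictor that falls back to "branch not taken" when no history exists. It is a generic illustration, not the conventional hardware the text refers to; the class name, the addresses and the table layout are assumptions.

```python
from typing import Dict, List


class LastOutcomePredictor:
    """Predicts that a branch goes where it went the last time it executed."""

    def __init__(self) -> None:
        # branch address -> next-instruction address observed last time
        self.history: Dict[int, int] = {}

    def predict(self, branch_addr: int, fallthrough: int) -> int:
        # With no history, predict that the branch does not occur.
        return self.history.get(branch_addr, fallthrough)

    def update(self, branch_addr: int, actual_next: int) -> None:
        self.history[branch_addr] = actual_next


def count_misses(actual_targets: List[int],
                 branch_addr: int = 0x10,
                 fallthrough: int = 0x14) -> int:
    """Run one branch through a sequence of outcomes and count mispredictions."""
    predictor = LastOutcomePredictor()
    misses = 0
    for actual in actual_targets:
        if predictor.predict(branch_addr, fallthrough) != actual:
            misses += 1
        predictor.update(branch_addr, actual)
    return misses


biased = count_misses([0x40, 0x40, 0x40, 0x40])       # always taken
alternating = count_misses([0x40, 0x14, 0x40, 0x14])  # no bias in direction
print(biased, alternating)
```

The biased branch is mispredicted only on its first execution, while the alternating branch is mispredicted every time, and the table grows by one entry per control instruction.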

The present invention has been made in view of the above. Its object is to provide an instruction supply system that makes the predicted next instruction address of each control instruction easy to obtain and makes the scheduling of instruction execution order easy, thereby realizing rapid parallel processing.

[Structure of the invention]

(Means for Solving the Problems) To achieve the above object, the present invention comprises: an instruction storage unit that stores, in an instruction string, a predicted next instruction address indicating the address at which the instruction to be executed next is stored, together with the instructions, pointed to by the predicted next instruction address, that are supplied to a parallel processing unit that executes instructions in parallel; and means for reading the predicted next instruction address and the instructions stored in the instruction storage unit, discriminating between the read predicted next instruction address and instructions, and supplying the instructions to the parallel processing unit.

(Operation) In the instruction supply system configured as above, the predicted next instruction address and the instructions are read from the instruction storage unit, which stores, in an instruction string, the predicted next instruction address indicating the address at which the instruction to be executed next is stored and the instructions, pointed to by that address, that are supplied to the parallel processing unit. The read predicted next instruction address and instructions are discriminated from each other and the instructions are supplied to the parallel processing unit, so rapid parallel processing can be realized.

(Embodiment) The present invention will now be described in detail with reference to the drawings.

FIG. 1 is a block diagram showing control of an embodiment of the instruction supply system of the present invention.

In the figure, instruction storage device 1 holds the instructions of an instruction string together with predicted next instruction addresses, each predicting the address of the instruction to be executed next in the string. A word holding a predicted next instruction address may be provided for every instruction, for every fixed number of instructions, or for every control instruction. Instruction storage device 1 also has an identification flag 3 that distinguishes instructions from predicted next instruction addresses: the flag is set to "0" for an instruction i and to "1" for a predicted next instruction address.

Instruction storage device 1 holds the instructions and predicted next instruction addresses, each with its identification flag set, in the order given by the predicted next instruction addresses.

Instruction fetch mechanism 5 includes a router 7, a selector 9, and an address register 11. It reads data from instruction storage device 1, discriminates instructions from predicted next instruction addresses, and outputs the instructions to parallel processing device 13, described later. Specifically, router 7 examines, via path B, the identification flag of each instruction or predicted next instruction address read from instruction storage device 1. When the identification flag is "0", router 7 judges the data to be an instruction and outputs it via path D to parallel processing device 13. When the identification flag is "1", router 7 judges the data to be a predicted next instruction address and outputs it via path C to selector 9, described later.

Selector 9 determines the value of address register 11, which indicates an address in instruction storage device 1. When the data from path A is a normal instruction, selector 9 increments the value of address register 11 via path E. When the data is a predicted next instruction address, selector 9 sets the value of that predicted next instruction address in address register 11.

Parallel processing device 13 executes the instructions input via path D in parallel at the instruction level, using pipelining or a similar technique, and outputs the branch destination address via path F when a control instruction is resolved.

Next, the operation of this embodiment will be described.

First, after the device is powered on and the system starts, the data at the address indicated by address register 11 of instruction storage device 1, that is, an instruction i or a predicted next instruction address, is read out. The data read out is output to router 7 via path A. When the identification flag of the data input via path B is "0", router 7 judges the data to be an instruction and supplies it via path D to parallel processing device 13, described later. When the identification flag of the input data is "1", the predicted next instruction address is output to selector 9 via path C. When the data from path A is a normal instruction, selector 9 increments the value of address register 11; when it is a predicted next instruction address, selector 9 sets the value of that predicted next instruction address in address register 11.
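The fetch sequence just described can be sketched as a short simulation. This is an illustrative model, not the patented circuit: the `Word` layout, the placement of the address word directly after the control instruction, the step limit and the sample memory contents are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Word:
    flag: int                 # identification flag: 0 = instruction, 1 = predicted next address
    payload: Union[str, int]  # instruction text, or an address in the instruction storage


def fetch(memory: List[Word], start: int, max_steps: int = 100) -> List[str]:
    """Return the instructions supplied to the parallel processing device."""
    address_register = start
    supplied: List[str] = []
    for _ in range(max_steps):                # bound the walk so a bad address word cannot loop forever
        if not 0 <= address_register < len(memory):
            break
        word = memory[address_register]       # path A: read the addressed word
        if word.flag == 0:                    # router: the word is an instruction
            supplied.append(str(word.payload))  # path D: to the parallel processing device
            address_register += 1             # selector: increment the address register
        else:                                 # router: the word is a predicted next address
            address_register = int(word.payload)  # path C: selector loads the address
    return supplied


# Hypothetical storage contents: the address word after the branch predicts "taken".
memory = [
    Word(0, "cmp"),
    Word(0, "bcond"),   # control instruction
    Word(1, 5),         # predicted next instruction address
    Word(0, "add"),     # fall-through path, skipped by the prediction
    Word(0, "mul"),
    Word(0, "sub"),     # predicted branch destination
    Word(0, "store"),
]
supplied = fetch(memory, start=0)
print(supplied)
```

In this model the processor receives an unbroken stream of instructions along the predicted path, and the address words themselves never reach it.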

Parallel processing device 13, to which instructions are supplied via path D, outputs the branch destination address via path F when a control instruction is resolved. When the prediction has failed, the correct instruction is refetched using the address set in address register 11. When processing is finished, parallel processing device 13 also updates the predicted next address field of instruction storage device 1 via path F.
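A minimal sketch of this recovery step follows. It assumes that each storage word is a `(flag, payload)` pair and that the predicted next address field is overwritten with the resolved address whenever the prediction fails; the text does not specify the update policy in this detail, and the function and variable names are hypothetical.

```python
from typing import List, Tuple

# Storage word: (identification flag, payload); flag 1 marks a predicted next address.
Word = Tuple[int, object]


def resolve_branch(memory: List[Word], field_index: int,
                   actual_next: int, address_register: int) -> Tuple[int, bool]:
    """Compare the stored prediction with the resolved address (path F).

    Returns the address register value to continue fetching from and a flag
    telling whether the prediction failed.
    """
    flag, predicted = memory[field_index]
    if flag != 1:
        raise ValueError("field_index must point at a predicted next address word")
    mispredicted = predicted != actual_next
    if mispredicted:
        address_register = actual_next          # refetch from the correct address
        memory[field_index] = (1, actual_next)  # assumed policy: remember the latest outcome
    return address_register, mispredicted


# Hypothetical state: word 2 predicted address 5, and fetch has run ahead to address 7.
memory: List[Word] = [(0, "cmp"), (0, "bcond"), (1, 5), (0, "add"),
                      (0, "mul"), (0, "sub"), (0, "store")]
first = resolve_branch(memory, field_index=2, actual_next=3, address_register=7)
second = resolve_branch(memory, field_index=2, actual_next=3, address_register=4)
print(first, second, memory[2])
```

The first call redirects fetching to address 3 and rewrites the field; the second call, with the same outcome, finds the prediction correct and leaves the address register alone.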

As a result, the predicted next instruction address can be obtained easily, instructions can be supplied to the computer without interruption, and rapid parallel processing can be realized.

[Effect of the invention]

As explained above, according to the present invention, the predicted next instruction address to be executed next is held in the instruction string and instructions are supplied continuously to the parallel processing unit. The predicted next instruction address of each control instruction can therefore be obtained easily, scheduling of the instruction execution order is made easy, and rapid parallel processing can be realized.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing control of an embodiment of the instruction supply system of the present invention, FIG. 2 is a diagram schematically showing an instruction sequence including control instructions, FIG. 3 is a diagram showing the execution order of basic blocks, and FIG. 4 is a diagram showing how the instructions in the pipeline change over time.

1: Instruction storage device
3: Identification flag
5: Instruction fetch mechanism
7: Router
9: Selector
11: Address register
13: Parallel processing device
A, B, C, D, E, F: Paths

Agent: Patent Attorney Hidekazu Miyoshi

Claims

[Claims] An instruction supply system that supplies instructions to a parallel processing unit that executes instructions in parallel, comprising: an instruction storage unit that stores, in an instruction string, a predicted next instruction address indicating the address at which the instruction to be executed next is stored, together with the instructions, pointed to by the predicted next instruction address, that are to be supplied to the parallel processing unit; and means for reading the predicted next instruction address and the instructions stored in the instruction storage unit, discriminating between the read predicted next instruction address and instructions, and supplying the instructions to the parallel processing unit.