JPH0545984B2

Info

Publication number: JPH0545984B2
Application number: JP58085752A
Authority: JP
Inventors: Yoichi Shintani; Tsuguo Shimizu; Kenichi Wada; Akira Yamaoka
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1983-05-18
Filing date: 1983-05-18
Publication date: 1993-07-12
Also published as: JPS59212961A

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of application of the invention]

The present invention relates to pipelined digital computers, and in particular to a hierarchical arithmetic scheme suited to further speeding up a computer that has a main arithmetic unit and a slave arithmetic unit in order to speed up the handling of address conflicts and conditional branch instructions.

[Background of the invention]

In this type of computer, the execution of each instruction is divided into a plurality of stages, and different stages of different instructions are executed in parallel, so that in effect a plurality of instructions are executed in parallel.

However, when the data required to process an instruction B is obtained from the result of the operation specified by a preceding instruction A, execution of instruction B must be delayed until the result of instruction A is determined. This is the case, for example, when execution of instruction A rewrites an index register or a base register, and instruction B calculates the address for a main memory access by adding the contents of that index register or base register to the address information contained in instruction B. A state in which the address for executing instruction B must be determined using the operation result of the preceding instruction A is called an address conflict. In this case, the address calculation stage of instruction B is delayed until the operation of instruction A is completed.
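As an illustration only, the address conflict condition defined above can be written as a small check. The `Instr` record and its field names are hypothetical and are not taken from the patent; the sketch assumes each instruction writes at most one general-purpose register.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    """Hypothetical summary of one instruction, for illustration only."""
    name: str
    writes_gr: Optional[int]         # register rewritten by the operation, if any
    index_reg: Optional[int] = None  # register used as index in address calculation
    base_reg: Optional[int] = None   # register used as base in address calculation


def has_address_conflict(a: Instr, b: Instr) -> bool:
    """True when b's address calculation needs a register that preceding a rewrites."""
    if a.writes_gr is None:
        return False
    return a.writes_gr in (b.index_reg, b.base_reg)


a = Instr("A", writes_gr=3)
b = Instr("B", writes_gr=5, index_reg=0, base_reg=3)
print(has_address_conflict(a, b))  # True: B's base register 3 is rewritten by A
```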

In computers using the hierarchical arithmetic scheme of the prior applications (Japanese Patent Application Nos. Sho 56-194002 and Sho 58-62143), this delay is reduced by providing, separately from a main arithmetic unit capable of executing the operations required by all instructions, a slave arithmetic unit capable of executing only the simple operations required by some instructions. In these prior applications, when an instruction C whose operation can be executed only by the main arithmetic unit requires a large number of cycles, the slave arithmetic unit skips the operation of instruction C while the main arithmetic unit is executing it and executes the operation of the next instruction A early. The result is then used as the contents of the index register or base register in the address calculation of the subsequent instruction B, which reduces the delay of the address calculation stage of instruction B.

In other words, in these prior applications the operation of instruction A can be executed earlier in the slave arithmetic unit than in the main arithmetic unit for the following reason. In the main arithmetic unit, the operation of the next instruction A starts only after the many cycles (denoted N1) needed for the operation of instruction C have elapsed, whereas the slave arithmetic unit skips the operation of instruction C in fewer cycles than N1 (denoted NS) and starts the operation of instruction A immediately after the skip. Conversely, when the number of operation cycles N1 of instruction C in the main arithmetic unit is so small as to equal the skip cycle count NS, or when instruction C is also executed by the slave arithmetic unit so that the operation of the next instruction A starts only after the same N1 cycles have elapsed, the operation of instruction A cannot be expected to execute much earlier in the slave arithmetic unit than in the main arithmetic unit.
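The timing argument above reduces to simple arithmetic, sketched below. The function names are illustrative, and cycles are counted from the cycle in which the operation of instruction C begins, which is an assumption made here for convenience.

```python
from typing import Tuple


def start_cycle_of_a(n1: int, ns: int, c_runs_in_slave: bool) -> Tuple[int, int]:
    """Return (main_start, slave_start) for the operation of instruction A.

    n1: cycles instruction C needs in the main arithmetic unit.
    ns: cycles the slave arithmetic unit needs to skip instruction C.
    c_runs_in_slave: True when C is also executed (not skipped) in the slave unit.
    """
    main_start = n1
    slave_start = n1 if c_runs_in_slave else ns
    return main_start, slave_start


def head_start(n1: int, ns: int, c_runs_in_slave: bool = False) -> int:
    """Cycles by which the slave unit starts A ahead of the main unit."""
    main_start, slave_start = start_cycle_of_a(n1, ns, c_runs_in_slave)
    return main_start - slave_start


print(head_start(8, 1))        # 7: a long instruction C is skipped quickly
print(head_start(1, 1))        # 0: N1 equals NS, no gain
print(head_start(8, 1, True))  # 0: C also runs in the slave unit, no gain
```

The last two calls correspond to the two cases in which the prior scheme gives no benefit.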

The same problem arises with conditional branch instructions. In the operation execution cycle of such an instruction (hereinafter called a BC instruction), whether the branch succeeds is determined on the basis of the condition code in that cycle. Therefore, when an instruction D preceding the branch instruction changes the condition code, the branch decision cannot be made until the operation of instruction D is completed. A state in which the branch decision of the BC instruction must wait until the operation of the preceding instruction D is completed is called a condition code conflict. In this case, the branch decision of the BC instruction is delayed until the operation of instruction D is completed.
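The condition code conflict defined above can be sketched in the same illustrative way. The `InFlight` record is hypothetical; it simply records whether a preceding instruction changes the condition code and whether its operation has finished.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InFlight:
    """Hypothetical state of an instruction that precedes a BC instruction."""
    name: str
    changes_cc: bool      # the operation rewrites the condition code
    operation_done: bool  # the operation has already completed


def has_cc_conflict(preceding: List[InFlight]) -> bool:
    """True when the BC decision must wait for an unfinished CC-changing instruction."""
    return any(i.changes_cc and not i.operation_done for i in preceding)


print(has_cc_conflict([InFlight("D", changes_cc=True, operation_done=False)]))  # True
print(has_cc_conflict([InFlight("D", changes_cc=True, operation_done=True)]))   # False
```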

To reduce the delay in the branch decision caused by a condition code conflict, the computers described in the prior applications, Japanese Patent Application Nos. Sho 56-194001 and Sho 58-62143, use a hierarchical arithmetic scheme similar to the one above. Here too, when an instruction E whose operation can be executed only by the main arithmetic unit requires a large number of cycles (N1), the slave arithmetic unit skips the operation of instruction E in NS cycles, fewer than N1, while the main arithmetic unit is executing it, and executes the operation of the next instruction D early. The result is then used for the branch decision of the subsequent BC instruction, which reduces the delay of the branch decision. In these prior applications, as when the hierarchical arithmetic scheme is applied to address conflicts, when N1 is so small as to equal NS, or when instruction E must also be operated on by the slave arithmetic unit, the operation of instruction D in the slave arithmetic unit cannot be expected to be performed earlier than in the main arithmetic unit.

[Purpose of the invention]

Accordingly, an object of the present invention is to solve the above problems of the hierarchical arithmetic scheme of the prior applications, and thereby to provide a pipeline-controlled computer with less processing delay due to address conflicts and condition code conflicts, and hence with higher performance.

[Summary of the invention]

In the hierarchical arithmetic scheme of the prior applications, instructions that require many operation cycles, and are therefore not operated on by the slave arithmetic unit, are skipped at high speed so that the operations of subsequent instructions are performed early in the slave arithmetic unit. The present invention solves the above problems of that scheme in two ways: by making the number of operation cycles N2 of an instruction in the slave arithmetic unit shorter than the number of operation cycles N1 of the same instruction in the main arithmetic unit, and by shortening the number of cycles the slave arithmetic unit needs to skip an instruction, even for the instruction with the fewest operation cycles among those operated on only by the main arithmetic unit.

Operating the slave arithmetic unit faster than the main arithmetic unit can be achieved by building the slave unit from faster circuits, or, taking advantage of its small logic scale, by building it with a high-density packaging technology.

When a data processing device includes two arithmetic units with different operating pitches, an easy approach is to synchronize the whole device with timing pulses whose pitch is a common divisor of the two pitches. Giving all the various control signals the same signal width further simplifies the logic design. As a specific example, in a data processing device in which the main arithmetic unit processes instructions at a 2-cycle pitch and the other parts, including the slave arithmetic unit, process instructions at a 1-cycle pitch, it is preferable to synchronize the whole device with timing pulses of 1-cycle pitch and to give all the various control signals a width of one cycle. Accordingly, each time the operation end signal of the main arithmetic unit is issued, it is on for one cycle and off for the next cycle.
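The synchronization rule above can be illustrated numerically. This is only a sketch: the text asks for a common divisor of the pitches, and the greatest common divisor is used here as one convenient choice. The function names are illustrative.

```python
from functools import reduce
from math import gcd
from typing import List


def common_pitch(pitches: List[int]) -> int:
    """Pick the greatest common divisor of the unit pitches as the timing-pulse pitch."""
    return reduce(gcd, pitches)


def end_signal(num_ops: int, unit_pitch: int, base_pitch: int) -> List[int]:
    """Operation end signal sampled once per timing pulse.

    Each completed operation turns the signal on for exactly one base cycle,
    then off for the remainder of the unit's pitch.
    """
    slots = unit_pitch // base_pitch
    wave: List[int] = []
    for _ in range(num_ops):
        wave += [1] + [0] * (slots - 1)
    return wave


print(common_pitch([2, 1]))  # 1
print(end_signal(3, 2, 1))   # [1, 0, 1, 0, 1, 0]
```

With a 2-cycle main unit and a 1-cycle base pitch, the end signal alternates on and off, matching the behaviour described in the text.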

[Embodiments of the invention]

[Overview of the device]

6A and 6B are instruction buffers, 8 is a readout circuit for selectively reading an instruction from one of these instruction buffers, 10 is an instruction register, 12 is an instruction decoder, 500 is a register that receives decoded information, and 14 is an instruction queue register for storing the decoded information from the instruction decoder 12. Here the instruction queue register 14 consists of three registers so that it can hold the decoded information of three instructions. 16A is a selector for selecting an instruction in the instruction queue register 14, and 20A is a first operation execution unit (hereinafter abbreviated as the first E unit) for executing the operation of the instruction selected by the selector 16A. The unit 20A is configured so that it can execute the operations specified by all instructions that this computer executes. 18A is a first general-purpose register (in practice a register group consisting of a plurality of general-purpose registers) for storing operands supplied to the first E unit 20A or operation results from it, and 22A is a first instruction control unit (hereinafter called the first I unit) that controls the execution of instructions.

24 is an address adder for calculating the address of a memory operand needed for an operation, 26 is the main memory or a cache memory that holds a copy of part of it, and 28 is an operand queue buffer that stores memory operands read from 26. The buffer 28 consists of three registers for storing the three operands corresponding to the three instructions held in the instruction queue register 14. The selector 30A selects an operand in the buffer 28 and supplies it to the first E unit 20A. The circuits described so far are the basic ones for executing instructions in a pipelined manner.

In addition to the above, this embodiment is provided with the following: a second operation execution unit 20B (hereinafter called the second E unit); a selector 16B for selecting, from the instruction queue register 14, the decoded information of the instruction to be executed by the unit 20B; a second general-purpose register 18B (again, in practice a register group consisting of a plurality of general-purpose registers) for storing operands to be sent to the second E unit 20B or the operation results of that unit; a selector 30B for selecting, from the operand queue buffer 28, the memory operand needed by the second E unit 20B; a second instruction control unit 22B (hereinafter simply called the second I unit) that controls execution in the second E unit 20B; registers 34A and 34B for storing the condition codes output from the first and second E units 20A and 20B; and circuits 32 and 36 that respectively detect address conflicts and condition code conflicts among the instructions flowing through the pipeline.

The second E unit 20B performs relatively simple operations such as addition and subtraction. In this embodiment, for simplicity, this unit is configured to execute only operations that complete in one machine cycle. As will become clear from the following description, however, the present invention is not limited to such a second E unit. Because the second E unit 20B is intended to execute relatively simple operations at high speed, it is desirable to place it closer than the first E unit 20A to the second general-purpose register 18B and the operand queue buffer 28. The first and second general-purpose registers 18A and 18B consist of the same number of registers.

In this device, all instructions are executed by the first E unit 20A. Among these instructions, those that rewrite the general-purpose register 18A and can be executed by the second E unit 20B are also executed by the second E unit 20B. Such instructions are therefore executed by both the first and second E units 20A and 20B. The same operation for the same instruction can, however, be started at different times in the first and second E units 20A and 20B. Specifically, two selectors 16A and 16B are provided to supply separately the decoded information needed by the first and second E units 20A and 20B, and two selectors 30A and 30B are provided to supply separately the memory operands needed by these units. The first E unit 20A and the selectors 16A and 30A are controlled by the first I unit 22A, while the second E unit 20B and the selectors 16B and 30B are controlled by the second I unit 22B. As a result, even when an instruction cannot yet be executed in the first E unit 20A, it can be executed in the second E unit 20B, so that address conflicts and condition code conflicts are resolved sooner.

[Instruction format]

The instructions used in this device are the same as those used in the Hitachi, Ltd. M-series computers or the IBM System/370 series computers. These instructions are classified into several formats; FIG. 3A shows one format needed to understand the present invention. Examples of instructions with this format are the add and load instructions, referred to below for simplicity as the A and L instructions. Bits 0-7 of these instructions represent the operation code (OPCODE). Bits 8-11 are the register field (R). In the A instruction this field gives the number of the general-purpose register that holds the operand to be read for the operation, and also the number of the register in which the operation result is stored. In the L instruction this field gives the number of the register in which the operation result is to be stored. Bits 12-15 and 16-19 are the index field (X) and the base field (B), respectively, each giving the number of a general-purpose register used to calculate the address of the storage operand to be read from the main memory 26. In the following, the general-purpose register numbers given by the register, index, and base fields are called the operand register number R_OP, the index register number R_X, and the base register number R_B, respectively. Bits 20-31 are the displacement field (D), which gives the displacement value DISP used in the address calculation.
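For illustration, the field layout of FIG. 3A can be expressed as a small decoder. The sketch assumes the IBM convention that bit 0 is the most significant bit of the 32-bit instruction word; the function name and the sample word are hypothetical.

```python
from typing import Dict


def decode_rx(word: int) -> Dict[str, int]:
    """Split a 32-bit instruction word into the fields of FIG. 3A.

    Assumes bit 0 is the most significant bit of the word.
    """
    if not 0 <= word < (1 << 32):
        raise ValueError("instruction word must fit in 32 bits")
    return {
        "OPCODE": (word >> 24) & 0xFF,  # bits 0-7
        "R_OP": (word >> 20) & 0xF,     # bits 8-11, register field (R)
        "R_X": (word >> 16) & 0xF,      # bits 12-15, index field (X)
        "R_B": (word >> 12) & 0xF,      # bits 16-19, base field (B)
        "DISP": word & 0xFFF,           # bits 20-31, displacement field (D)
    }


# Hypothetical sample word: opcode 0x5A, R=1, X=0, B=11, D=4.
fields = decode_rx(0x5A10B004)
print(fields)
```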

FIG. 3B shows the instruction format of the BC instruction. Bits 8-11 of this format are the mask field (M), which specifies the condition code values for which the branch succeeds.
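A minimal sketch of how such a mask could be tested against a condition code follows. The bit assignment is an assumption: it follows the System/370 convention in which the leftmost mask bit corresponds to condition code 0 and the rightmost to condition code 3. The patent text itself states only that the mask specifies the condition code values for which the branch succeeds.

```python
def branch_taken(mask: int, cc: int) -> bool:
    """Return True when the 4-bit mask selects the current condition code.

    Assumed mapping: mask bit 8 (leftmost) selects CC=0, bit 11 selects CC=3.
    """
    if not (0 <= mask <= 0xF and 0 <= cc <= 3):
        raise ValueError("mask must be 4 bits and cc must be 0..3")
    return bool(mask & (0b1000 >> cc))


print(branch_taken(0b1000, 0))  # True: mask selects CC=0
print(branch_taken(0b1000, 1))  # False
print(branch_taken(0b1111, 3))  # True: an all-ones mask always branches
```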

The index field (X), base field (B), and displacement value DISP of the BC instruction are used to calculate the address of the branch target instruction to be read from the main memory 26.

The following embodiment is described using only the A, L, and BC instructions. Of these, the instructions that the second E unit 20B can execute are the A and L instructions, and the instruction that rewrites the condition code is the A instruction.

[Reading instructions]

One of the instruction buffers 6A and 6B is used to store a sequence of instructions that is contiguous in memory and contains the instruction currently about to be processed (called the main stream). The other is used to store a sequence of instructions that is contiguous in memory and begins at the branch target instruction (called the target stream), once processing of a branch instruction on the main stream has started. When the processed branch instruction is judged to have succeeded and processing of the branch target instruction begins, the instruction sequence that had been the target stream is thereafter regarded as the main stream. Thus the instruction buffer that holds the main stream switches each time a branch instruction succeeds. Fetching of instruction sequences from the main memory 26 and their storage in the buffers 6A and 6B are controlled by an instruction fetch circuit (not shown). The readout circuit 8 selects instructions one by one from whichever of the buffers 6A and 6B currently holds the main stream and sends them in turn to the instruction register 10. The readout circuit 8 selects buffer 6A or 6B according to whether the value of a flip-flop 9 is 1 or 0 (hereinafter a flip-flop may be abbreviated as FF). As described later, the flip-flop 9 changes its value each time the branch success signal BCTKN becomes 1.
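The buffer switching described above behaves like a toggle driven by BCTKN, as in the following sketch. The class and method names are illustrative, and which flip-flop value selects which buffer is an assumption; the text states only that the selection depends on whether the value is 1 or 0.

```python
class InstructionBuffers:
    """Illustrative model of flip-flop 9 selecting buffer 6A or 6B."""

    def __init__(self) -> None:
        self.ff9 = 0  # assumed mapping: 0 selects buffer 6A, 1 selects buffer 6B

    def main_stream_buffer(self) -> str:
        """Name of the buffer currently holding the main stream."""
        return "6A" if self.ff9 == 0 else "6B"

    def clock(self, bctkn: int) -> None:
        """Toggle flip-flop 9 whenever the branch success signal BCTKN is 1."""
        if bctkn == 1:
            self.ff9 ^= 1


bufs = InstructionBuffers()
for bctkn in (0, 1, 0, 1):
    bufs.clock(bctkn)
    print(bctkn, bufs.main_stream_buffer())
```

Each successful branch swaps the roles of the two buffers, so the former target stream is read as the main stream.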

As described later, each time the decode stage of the instruction in the instruction register 10 is executed, a control circuit (not shown) sets the instruction output by the readout circuit 8 into the instruction register 10.

(Overview of instruction execution)

Taking the computer described in the prior Japanese Patent Application No. Sho 58-62143 as an example of a computer using the hierarchical arithmetic scheme, the execution stages of its Load instruction are as shown in FIG. 9D. They consist of the D stage for instruction decoding and address calculation, the A stage for reading an operand or a branch target instruction, the L1 and L2 stages for setup of the first and second E units, the E1 and E2 stages for operations in the first and second E units, and the P1 and P2 stages for writing operation results to the first and second general-purpose registers. In that prior application, each of these stages takes one machine cycle.

In the embodiment of the present invention, on the other hand, the execution stages of the Load instruction are as shown in FIG. 9E. They consist of the D and D' stages for instruction decoding and address calculation, the A and A' stages for reading an operand or a branch target instruction, the L1 and L2 stages for setup of the first and second E units, the E1 and E2 stages for operations in the first and second E units, and the P1 and P2 stages for writing operation results to the first and second general-purpose registers. That is, the processing of the D stage in the prior application corresponds to the processing of the D and D' stages in this embodiment, and the A stage corresponds to the A and A' stages. This embodiment uses a timing pulse T0' whose pitch is 1/2 that of the timing pulse T0 of the prior application, so the machine cycle is halved. Accordingly, in this embodiment the D, D', A, A', L2, E2, and P2 stages each take one machine cycle, and the L1, E1, and P1 stages each take two machine cycles.

Note that although the machine cycle of this embodiment is half that of the prior application, the real time is the same in both for the instruction decode and address calculation stages (D versus D and D'), the operand readout stages (A versus A and A'), the setup stage for the first E unit (L1 in each), the operation stage in the first E unit (E1 in each), and the stage for writing the operation result to the first general-purpose register (P1 in each). These logic circuits can therefore be built with circuits and packaging technology of the same level in both the prior application and this embodiment.

In this embodiment, on the other hand, the L2, E2, and P2 stages are each executed in one machine cycle, so each is processed in half the real time of the L2, E2, and P2 stages of the prior application. This is feasible because the circuits involved in executing L2, E2, and P2 are essentially limited to the second I unit, the second E unit, and the second general-purpose register. Their logic accounts for a small proportion of the whole device, so they can be placed in a very confined region of it. Alternatively, it can be achieved by building only these three blocks with high-performance circuits and packaging technology. Even if such technology is more expensive than that of the surrounding logic, these blocks account for a small share of the whole, so the effect on the cost of the entire device can be kept small.
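The stage timings described above can be checked with simple arithmetic. The sketch below sums stage durations along each path in units of the prior application's machine cycle T0, assuming the stages of one instruction follow one another without stalls. That is a simplification for illustration, and the dictionary and function names are hypothetical.

```python
from typing import Dict, List

# Machine cycles per stage in the prior application (cycle time = 1.0 T0).
PRIOR: Dict[str, int] = {"D": 1, "A": 1, "L1": 1, "E1": 1, "P1": 1,
                         "L2": 1, "E2": 1, "P2": 1}

# Machine cycles per stage in this embodiment (cycle time T0' = 0.5 T0).
EMBODIMENT: Dict[str, int] = {"D": 1, "D'": 1, "A": 1, "A'": 1,
                              "L1": 2, "E1": 2, "P1": 2,
                              "L2": 1, "E2": 1, "P2": 1}


def path_time(cycles: Dict[str, int], stages: List[str], cycle_time: float) -> float:
    """Real time for the listed stages executed back to back, in units of T0."""
    return sum(cycles[s] for s in stages) * cycle_time


main_prior = path_time(PRIOR, ["D", "A", "L1", "E1", "P1"], 1.0)
main_new = path_time(EMBODIMENT, ["D", "D'", "A", "A'", "L1", "E1", "P1"], 0.5)
slave_prior = path_time(PRIOR, ["D", "A", "L2", "E2", "P2"], 1.0)
slave_new = path_time(EMBODIMENT, ["D", "D'", "A", "A'", "L2", "E2", "P2"], 0.5)

print(main_prior, main_new)    # 5.0 5.0: the main path takes the same real time
print(slave_prior, slave_new)  # 5.0 3.5: the slave path is shorter
```

Only the L2, E2, and P2 stages shrink in real time, which is why only the second I unit, second E unit, and second general-purpose register need faster circuitry.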

The instruction set in the instruction register 10 is executed in the following stages.

(D, D' stages)

These are the stages for instruction decoding and address calculation. The instruction in the instruction register 10 is decoded by the decoder 12 to generate decoded information, which is set in one of the registers of the instruction queue register 14. In addition, the base register number R_B and the index register number R_X of the instruction in the instruction register 10 are input to the second general-purpose register 18B via line 40. The two data items read from the general-purpose register 18B according to these register numbers are input to the address adder 24 via line 47 and added to the displacement value DISP input from the instruction register 10 via line 40. The address is calculated in this way.
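The address calculation of these stages amounts to a three-input addition, sketched below. The function name and register contents are hypothetical. The sketch does not model any address-width wraparound or any special treatment of register number 0, since neither is described in the text.

```python
from typing import List


def operand_address(gr_18b: List[int], r_b: int, r_x: int, disp: int) -> int:
    """Add the base register, index register and displacement.

    gr_18b: contents of the second general-purpose register group 18B.
    r_b, r_x: base and index register numbers taken from the instruction.
    disp: 12-bit displacement value DISP taken from the instruction.
    """
    if not 0 <= disp <= 0xFFF:
        raise ValueError("DISP is a 12-bit field")
    return gr_18b[r_b] + gr_18b[r_x] + disp


gr = [0] * 16       # hypothetical register contents
gr[11] = 0x1000     # base register
gr[2] = 0x20        # index register
print(hex(operand_address(gr, r_b=11, r_x=2, disp=4)))  # 0x1024
```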

In this embodiment, the decoder 12 outputs the following information on line 42 as decoded information, based on the operation code OP CODE and the operand register number R_OP of the instruction.

(1) OP CODE

(2) R_R: the operand read register number. For the A instruction described above, it is equal to the operand register number R_OP.

(3) R_W: the number of the register into which the operation result is written. For the two instructions described above, it is equal to the operand register number R_OP.

(4) SUBGR: a second operation indication signal showing that the instruction is one that stores its operation result in a general-purpose register and can be executed by the second E unit 20B. This signal is 1 for the A and L instructions, for example.

(5) CHGGR: a register change indication signal showing that the instruction stores its operation result in a general-purpose register. This signal is 1 for the A and L instructions, for example.

(6) SUBCC: a second operation indication signal showing that the instruction is one that changes the condition code and can be executed by the second E unit 20B. This signal is 1 for the A instruction, for example.

(7) CHGCC: a CC change indication signal showing that the instruction changes the condition code. This signal is 1 for the A instruction, for example.

(8) BC: a BC instruction indication signal showing that the instruction is a BC instruction.

(9) MASK: the mask field of the BC instruction itself.
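The decoded information listed above can be summarized for the three instructions used in this description as follows. This is a sketch only: the `DecodeInfo` record and the `decode` function are hypothetical, mnemonics stand in for the binary operation codes, and fields that the text does not define for a given instruction (for example R_R for the L instruction) are left as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecodeInfo:
    """Illustrative record of the decoded information output on line 42."""
    op_code: str
    r_r: Optional[int] = None   # operand read register number
    r_w: Optional[int] = None   # result write register number
    subgr: int = 0              # rewrites a GR and is executable in the second E unit
    chggr: int = 0              # rewrites a GR
    subcc: int = 0              # changes the CC and is executable in the second E unit
    chgcc: int = 0              # changes the CC
    bc: int = 0                 # is a BC instruction
    mask: Optional[int] = None  # mask field of a BC instruction


def decode(mnemonic: str, field_8_11: int) -> DecodeInfo:
    """Produce decoded information for the A, L and BC instructions only."""
    if mnemonic == "A":
        return DecodeInfo("A", r_r=field_8_11, r_w=field_8_11,
                          subgr=1, chggr=1, subcc=1, chgcc=1)
    if mnemonic == "L":
        return DecodeInfo("L", r_w=field_8_11, subgr=1, chggr=1)
    if mnemonic == "BC":
        return DecodeInfo("BC", bc=1, mask=field_8_11)
    raise ValueError("only A, L and BC are modelled in this sketch")


print(decode("A", 3))
print(decode("BC", 0b1000))
```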

The instruction decoder 12 also generates other decoded information needed for instruction execution control, but this is the same as in the prior art and is not directly related to the present invention, so its description is omitted.

The D and D' stages each complete in one machine cycle.

(A, A' stages)

Based on the memory address calculated in the D and D' stages, the operand is read from the main memory 26 and stored in one of the registers of the operand queue buffer 28. The A and A' stages also each complete in one machine cycle.

(L1 stage)

The decoded information of the instruction, together with the register operand and memory operand needed for the operation, is set in the first E unit 20A. Specifically, the selector 16A selects the decoded information of the instruction to be executed and outputs it on line 44A. Of the selected decoded information, the read register number R_R is input to the first general-purpose register 18A; the required operand RDATA1 is read out on that basis and sent to the first E unit 20A via line 46A. Meanwhile, the OP CODE, the write register number R_W, the second-operation indication signal SUBGR, the register-change indication signal CHGGR, and the condition-code-change indication signal CHGCC of the decoded information are sent to the first E unit 20A. In addition, the selector 30A selects the required memory operand MDATA1 from the operand queue buffer 28 and sends it to the first E unit 20A via line 45A. The first E unit 20A latches these inputs into its internal registers. This stage executes in two machine cycles.

(E1 stage)

The first E unit 20A executes the required operation on the data latched in the L1 stage and outputs the result WDATA1 on line 50A. At the same time, it outputs the write register number R_W that was latched earlier and the write signal WC1 for the first general-purpose register.

When the operation is one that changes the condition code, the condition code CC1 is computed from the operation result, and this code CC1 and the set instruction signal SET1 are output on line 70A.

Each E1 stage takes two machine cycles. Depending on the type of instruction, two or more E1 stages may be required. The shortest operations therefore finish in two machine cycles, and longer ones finish in a multiple of two machine cycles. In the first-half machine cycle of the last of the E1 stages, the operation end signal EOP1 is output on line 48A. Consequently, for an operation that finishes in the minimum two machine cycles, the signal EOP1 is output in the cycle in which the operation starts.
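The E1 timing rule can be written out as a short Python sketch. The function name `e1_timing` and the use of plain integer cycle numbers are assumptions made for illustration only. It returns the cycle in which EOP1 is output (the first-half cycle of the last E1 stage) and the cycle in which the operation finishes.

```python
E1_STAGE_CYCLES = 2  # each E1 stage takes two machine cycles, per the text


def e1_timing(start_cycle, num_e1_stages=1):
    """Return (eop1_cycle, last_cycle) for an operation that begins its
    first E1 stage in start_cycle and needs num_e1_stages E1 stages."""
    # EOP1 is output in the first-half cycle of the last E1 stage.
    eop1_cycle = start_cycle + E1_STAGE_CYCLES * (num_e1_stages - 1)
    last_cycle = start_cycle + E1_STAGE_CYCLES * num_e1_stages - 1
    return eop1_cycle, last_cycle
```

For a minimum-length operation, `e1_timing(6)` gives `(6, 7)`: EOP1 appears in the same cycle in which the operation starts, as the text states.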

(P1 stage)

The operation result WDATA1 obtained in the E1 stage is written to the register within the first general-purpose register 18A that is designated by the write register number R_W. When the operation is one that the second E unit 20B cannot execute, the result WDATA1 and the write register number R_W are also input to the second general-purpose register 18B, and the same write is performed there. As already described, the second general-purpose register 18B receives the operation result WDATA2 of the second E unit 20B, but the second E unit 20B can execute only a subset of operations, so the contents of the first and second general-purpose registers 18A and 18B could diverge. Writing the result WDATA1 of the first E unit 20A to the second general-purpose register 18B as well prevents this mismatch. The second general-purpose register can therefore be used for the address calculation described above without any problem. When the operation changes the condition code, the computed condition code CC1 is set in the register 34A in response to the set signal SET1. The P1 stage finishes in the two machine cycles following the E1 stage.

Execution of one instruction is completed in this way. However, an instruction that the second E unit 20B can execute and that rewrites a general-purpose register or the condition code is also executed by the second E unit 20B, so in addition to the operations described for the stages above, the following stages are executed as well.

(L2 stage)

The decoded information, register operand, and memory operand needed for the operation are set in the second E unit 20B. Specifically, the selector 16B selects the decoded information of the instruction to be executed and outputs it on line 44B. Of the selected decoded information, the read register number R_R is input to the second general-purpose register 18B; the required operand RDATA2 is read out on that basis and sent to the second E unit 20B via line 46B. Meanwhile, the OP CODE, the write register number R_W, the second-operation indication signals SUBGR and SUBCC, and the condition-code-change indication signal CHGCC are sent to the second E unit 20B. In addition, the selector 30B selects the required memory operand MDATA2 from the operand queue buffer 28 and sends it to the second E unit 20B via line 45B. The second E unit 20B latches these inputs into its internal registers. This stage executes in one machine cycle.

(E2 stage)

The second E unit 20B executes the required operation on the data latched in the L2 stage and outputs the result WDATA2 on line 50B. At the same time, the write register number R_W that was latched earlier is also output. In addition, if the instruction rewrites the condition code, the condition code CC2 is computed.

The second E unit 20B is assumed here to perform only relatively simple operations, such as addition, subtraction, and load, each in one machine cycle; it is built from very few circuit elements and cannot perform somewhat more complex operations such as multiplication. The E2 stage therefore finishes in one machine cycle. In general, the second E unit 20B is not limited to this, and the E2 stage may take a different number of machine cycles depending on the operation that the second E unit 20B executes. In that case, the operation end signal EOP2 is output on line 48B in the final cycle of the stage, so this last cycle can be called the EOP2 cycle. Under the present assumption that the second E unit 20B performs only operations that always finish in one machine cycle, the E2 stage consists solely of the EOP2 cycle.

(P2 stage)

The operation result WDATA2 obtained in the E2 stage is written to the register within the second general-purpose register 18B that is designated by the write register number R_W. The computed condition code CC2 is set in the register 34B via line 70B.

[Details of operation I]

In the following, the details of the apparatus are first explained through its operation when there is an address conflict. The parts of the apparatus involved with the condition code, and their operation, are explained together later.

To explain the operation of the apparatus when there is an address conflict, assume that a first L instruction L(1), a second L instruction L(2), an A instruction, and a third L instruction L(3) are executed in this order. Assume further that the operand register number R_OP of the A instruction is equal to the address register number R_X or R_B of the L(3) instruction, so that an address conflict arises between these two instructions. Assume, however, that there is no address conflict among the L(1), L(2), and A instructions, between L(1) or L(2) and L(3), or between any of the L(1), L(2), A, and L(3) instructions and the instructions that precede them. Also assume that no condition code conflict exists between any instructions.

The explanation refers to the time charts 9A to 9C.

(Details of the D and D' stages)

For the D and D' stages to be executed, the decoded information of the instruction must be set in the instruction queue register 14. Setting into the instruction queue register 14 is controlled by the first I unit 22A. The method of control is essentially unchanged from that of the prior application. That is, new decoded information can be set in the instruction queue register 14 when an instruction has been set in the instruction register 10, the instruction queue register 14 has a free register, and the instruction has no address conflict or condition code conflict with any instruction already set in the instruction queue register 14 or already being executed.

To detect address conflicts, the address register fields R_B and R_X of the instruction set in the instruction register 10 are input via line 40 to the address conflict detection circuit 32, which detects the presence or absence of an address conflict as described later; the detection result ACONF is input to the first I unit 22A via line 58. Similarly, condition code conflicts are detected by the circuit 36 as described later, and the detection result CCONF is input to the first I unit 22A via line 72.

As shown in FIG. 4, the first I unit 22A has a flip-flop 70 that is set by a control circuit (not shown) each time that control circuit sets a new instruction in the instruction register 10. The queue control circuit 73 has three flip-flops 518 to 520 that indicate whether each register of the instruction queue register 14 is free; based on the set states of these three flip-flops, it outputs the queue busy signal BSY from the AND gate 528 when no register is free. The decode success determination circuit 75 outputs the decode success signal DS on line 56A when the output of the flip-flop 70 is 1 and the conflict signals ACONF and CCONF and the queue busy signal BSY are all 0. After timing adjustment by FF 511, DS is sent to the instruction queue register 14 to direct it to take in the decoded information from the decoder 12. The input pointer IP, which designates the register within the instruction queue register 14 that is to receive this decoded information, is output on line 56A by the queue control circuit 73 and, after timing adjustment by FFs 505 and 506, is sent to the instruction queue register 14. To produce the input pointer IP, the queue control circuit 73 contains a counter 502 that counts 0, 1, 2 in sequence and outputs its count value as IP. The counter is updated one cycle after the rise of the decode success signal DS. The signal DS also resets the flip-flop 70, and in response to this reset the control circuit (not shown) sets the next instruction in the instruction register 10.
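The decode success condition and the input pointer can be summarized in a small Python sketch. This is a behavioral model written for illustration, not the circuit itself; the names `queue_busy`, `decode_success`, and `InputPointer` are hypothetical, and cycle-level timing (the FF 505, 506, and 511 adjustments) is deliberately left out.

```python
def queue_busy(occupied):
    """BSY (AND gate 528): 1 only when all three queue registers are occupied.
    `occupied` models the states of flip-flops 518 to 520."""
    return all(occupied)


def decode_success(ff70, aconf, cconf, bsy):
    """DS: an instruction is waiting (flip-flop 70 = 1) and none of
    ACONF, CCONF, BSY is asserted."""
    return bool(ff70) and not (aconf or cconf or bsy)


class InputPointer:
    """Models counter 502, which counts 0, 1, 2 and wraps around."""

    def __init__(self):
        self.value = 0

    def advance(self):
        self.value = (self.value + 1) % 3
        return self.value
```

With this model, an instruction that has an address conflict (`aconf=1`) never produces DS, so the pointer is not advanced and the instruction stays in the instruction register until the conflict is released.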

Assume now that the L(1), L(2), and A instructions are set, in this order, in registers 0, 1, and 2 of the instruction queue register 14. The L(3) instruction will then be set in register 0 after the L(1) instruction. The start and the midpoint of each cycle are called T0 and T1, respectively. A T0 or T1 written beside a register or flip-flop in the accompanying drawings indicates that the contents of that register or flip-flop change at timing T0 or T1. The L(1) instruction is set in the instruction register 10 at timing C0, T0 (that is, timing T0 within cycle C0; the same notation is used below). Assuming that the L(1) instruction has no address conflict or condition code conflict with any preceding instruction and that the instruction queue register 14 is not busy, the decode success signal DS is output at timing C0, T1. Since the input pointer IP has the value 0 at that time by assumption, these signals cause the decoded information of the L(1) instruction to be set in register 0 of the instruction queue register 14 at timing C2, T0.

After this, the input pointer IP is updated to 1 at timing C1, T1. At timing C1, T0, the L(2) instruction is set in the instruction register 10, and the D stage of the L(2) instruction executes in exactly the same way as that of the L(1) instruction, one cycle later. As a result, the decoded information of the L(2) instruction is set in register 1 of the instruction queue register 14, and the input pointer is updated to 2. Similarly, the D stage of the A instruction executes in cycle C2, its decoded information is set in register 2 of the instruction queue register 14, and IP is updated to 0.

At timing C3, T0, the next instruction, L(3), is set in the instruction register 10. By assumption, however, this L(3) instruction has an address conflict with the A instruction already set in the instruction queue register 14. The output ACONF of the address conflict detection circuit 32 therefore becomes 1, as described later, and the decode success determination circuit 75 does not output the decode success signal DS until release of this address conflict is detected. The L(3) instruction is thus not taken into the instruction queue register 14, and execution of its D stage is postponed. In this embodiment, the D stage of the L(3) instruction executes in cycle C8, as described later. The input pointer IP therefore keeps the value 0 until then.

Address calculation in the D and D' stages proceeds with the following timing.

After the L(1) instruction is set in the instruction register 10 at timing C0, T0, the address information designated by the instruction is read immediately from the second general-purpose register 18B. The address adder 24 first holds the input data in an internal register (not shown), then performs the addition in one cycle and outputs the result. The memory address output by the address adder 24 is therefore determined at timing C2, T0.

The address calculation for the L(2) instruction proceeds in exactly the same way, one cycle after that for the L(1) instruction, and the memory address for the L(2) instruction is determined at timing C3, T0.

Each time new decoded information is set in the instruction queue register 14, the queue control circuit 73 in the first I unit 22A sets the corresponding one of the flip-flops 518 to 520, which indicate whether each register in the register 14 is free.

(Details of the A and A' stages)

In these stages, the memory operand is read from the main memory 26 using the memory address obtained in the D and D' stages and is set in the operand queue buffer 28. As already described, the decoded information of the L(1) instruction has been set in register 0 of the instruction queue register 14. The memory operand for the L(1) instruction is accordingly also set in register 0 of the operand queue buffer 28. For this purpose, the first I unit 22A supplies to the buffer 28, via line 57A, the signals IPD and DSD, obtained by delaying the input pointer IP and the decode success signal DS by 2.5 cycles in the flip-flops 505 to 510 and 511 to 513. As a result, the memory operand for the L(1) instruction is set in register 0 of the buffer 28 at timing C4, T0 in response to the signals DSD and IPD. Similarly, the memory operand for the L(2) instruction is set in register 1 of the buffer 28 at timing C5, T0.

(Details of the L1 stage)

The L1 stage of each instruction starts when the E1 stage of the preceding instruction ends, that is, when the first operation end signal EOP1 is output from the first E unit. In the L1 stage, one item of decoded information and one memory operand are selected from the instruction queue register 14 and the operand queue buffer 28 by the selectors 16A and 30A, respectively, and set in the first E unit 20A.

This is controlled by the first I unit 22A. The queue control circuit 73 in the first I unit 22A has a counter 532 that sends to the selectors 16A and 30A, via line 53A, the output pointer OP1 indicating the register number or buffer number to be selected. This counter cycles repeatedly through the values 0, 1, and 2. When the first operation end signal EOP1 is output from the first E unit 20A, the queue control circuit 73 outputs the first operation start signal BOP1 on line 52A for one cycle, starting one cycle later, and updates the output pointer OP1 at timing T0 of the cycle following the cycle in which BOP1 was output.

As described later, the first E unit 20A outputs the signal EOP1 every other cycle, from timing T0 of the first-half cycle of the final E1 stage of the preceding instruction until the signal BOP1 is subsequently input. Assuming that the E1 stage of the instruction preceding L(1) (call it instruction X) finishes in cycles C4 and C5, the signal EOP1 is output for one cycle from timing C4, T0. At this time the queue control circuit 73 is, by assumption, outputting the value 0 as the output pointer OP1, which selects the L(1) instruction, and it outputs the first operation start signal BOP1 for one cycle from timing C5, T0. The output pointer OP1 is updated to 1 at timing C6, T0 in response to this BOP1.

To this end, the queue control circuit 73 contains flip-flops 521 to 523, 549 to 551, and 524 to 526, which delay the outputs of flip-flops 518 to 520 by one, two, and three cycles, and a selector 527 that selects, from flip-flops 524 to 526, the value of the one designated by the OP1 signal. There is also a gate 529 that takes the logical AND of this selector's output and the EOP1 signal, and a flip-flop 530 that latches the output of gate 529 at T0. The output of flip-flop 530 is the operation start signal BOP1 for the first E unit. When the output of gate 529 becomes 1, those of the flip-flops 518 to 520, 521 to 523, 549 to 551, and 524 to 526 that correspond to the instruction queue entry designated by the current value of OP1 are reset. This reset is performed by signals generated by the decoder 531.
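The gating of BOP1 can be sketched in Python as follows. This is a simplified behavioral model under stated assumptions: `entry_ready` stands for the three-cycle-delayed valid bits held in flip-flops 524 to 526, cycles are plain integers, and the function names `gate_529` and `start_timing` are hypothetical.

```python
def gate_529(entry_ready, op1, eop1):
    """AND of selector 527's output (the delayed valid bit of the queue
    entry designated by OP1) and the EOP1 signal."""
    return bool(entry_ready[op1]) and bool(eop1)


def start_timing(eop1_cycle):
    """Return (bop1_cycle, op1_update_cycle): BOP1 is output one cycle
    after EOP1, and OP1 is updated at T0 of the cycle after BOP1."""
    bop1_cycle = eop1_cycle + 1
    return bop1_cycle, bop1_cycle + 1
```

For the example in the text, `start_timing(4)` gives `(5, 6)`: EOP1 in cycle C4, BOP1 in cycle C5, and OP1 updated at timing C6, T0. If the entry designated by OP1 is not yet ready, `gate_529` stays false and no BOP1 is produced for that EOP1.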

In this way, based on the value 0 of the output pointer OP1 output in cycles C4 and C5, the selectors 16A and 30A select the decoded information and the memory operand of the L(1) instruction.

Meanwhile, the first E unit 20A latches the selected information in response to the first operation start signal BOP1 output in cycle C5. At this time, the read register number R_R in the decoded information selected by the selector 16A is input to the first general-purpose register 18A, and the register operand RDATA1 read out on that basis is input to the first E unit 20A via line 46A. The first E unit 20A latches this operand RDATA1 as well.

As shown in FIG. 5, the first E unit 20A consists of a first arithmetic circuit 400, flip-flops 401 and 403, and an AND gate 405. In response to the operation start signal BOP1 input from the first I unit via line 52A, the first arithmetic circuit 400 sets in its internal registers (not shown) the OP CODE, write register number R_W, register-change indication signal CHGGR, and condition-code-change indication signal CHGCC input from the selector 16A via line 44A; the register operand RDATA1 input from the first general-purpose register 18A via line 46A; and the memory operand MDATA1 input from the selector 30A via line 45A. The second-operation indication signal SUBGR input from the selector 16A via line 44A is set in the flip-flop 401 in response to the signal BOP1. The input data is thus latched into the first E unit 20A.

In the example under consideration, the data for the L(1) instruction is latched into the first E unit 20A at timing C6, T0, which completes the L1 stage of the L(1) instruction. The L1 stage of the next instruction begins at the timing when the operation end signal EOP1 for the L(1) instruction is output (cycle C6 here). Likewise, the L1 stage of the A instruction begins at timing C8, when EOP1 for the L(2) instruction is output.

For the L(3) instruction, however, instruction decoding and operand readout have not yet finished in cycle C10, when EOP1 for the A instruction is issued (its A and A' stages have not completed by then), so its L1 stage is not executed. The L1 stage of the L(3) instruction executes from cycle C12 onward, after the A and A' stages of the L(3) instruction have completed.

(Details of the E1 stage)

The first arithmetic circuit 400 performs the operation designated by the input OP CODE (addition, for the A instruction, for example) on the register operand RDATA1 and the memory operand MDATA1 and outputs the result data WDATA1 on line 50A. It also outputs the operation end signal EOP1 every other cycle, from timing T0 of the first-half cycle of the final E1 stage of the operation until the signal BOP1 is subsequently input. The write register number R_W in the input decoded information is held in the circuit 400 until the result data WDATA1 has been computed and is output on line 50A together with WDATA1. When the input condition-code-change indication signal CHGCC is 1, the circuit 400 computes the condition code CC1 from the OP CODE and the result data WDATA1 and outputs it, together with the set signal SET1, in the second-half cycle of the final E1 stage. When CHGCC is 0, the signal SET1 remains 0. Further, when the latched register-change indication signal CHGGR is 1, the circuit 400 outputs the write signal WC1 on line 50A in synchronization with the output of the result data WDATA1. The circuit 400 is implemented, for example, as a microinstruction-controlled circuit.

Meanwhile, the second-operation indication signal SUBGR set in the flip-flop 401 at timing T0 is transferred to the flip-flop 403 at a later timing T0. The inverted value of the SUBGR signal in the flip-flop 403 and the write signal WC1 output by the circuit 400 on line 50A are input to the AND gate 405. The output WC12 of gate 405 is therefore 1 only when the operation executed by the first arithmetic circuit 400 cannot be executed by the second E unit and rewrites a general-purpose register. This output WC12 is used to write the operation result WDATA1 into the second general-purpose register 18B.
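The logic of AND gate 405 is small enough to state directly. The following Python sketch, whose function name `wc12` is chosen only for illustration, computes WC12 from WC1 and SUBGR exactly as the text describes: WC1 ANDed with the inverse of SUBGR.

```python
def wc12(wc1, subgr):
    """Output of AND gate 405: WC1 AND (NOT SUBGR).

    WC12 is 1 only when the first E unit writes a general-purpose register
    (WC1 = 1) for an operation the second E unit cannot execute (SUBGR = 0).
    """
    return int(bool(wc1) and not bool(subgr))
```

For the A and L instructions, SUBGR is 1, so `wc12(1, 1)` is 0 and the first E unit's result is not copied into the second general-purpose register 18B.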

(Details of the P1 stage)

The result data WDATA1 output by the first E unit 20A on line 50A is written at timing T0, under control of the write signals WC1 and WC12 on that line, to the registers in the first and second general-purpose registers 18A and 18B that have the write register number R_W. The computed condition code CC1 is written to the register 34A in response to the set signal SET1. The register 34A thus holds the condition code output by the most recently executed of the instructions that change the condition code. The P1 stage therefore executes in cycles C8 and C9 for the L(1) instruction, C10 and C11 for the L(2) instruction, and C12 and C13 for the A instruction. Because all three of these instructions can be executed by the second E unit 20B (SUBGR = 1), however, WC12 = 0 and the result data WDATA1 is not written to the second general-purpose register 18B. Since the L(1) and L(2) instructions do not change the condition code, the contents of the register 34A are not changed by them. The condition code CC1 obtained by the operation of the A instruction is set in the register 34A in cycle C10.

以上のようにして、Ｌ(1)，Ｌ(2)，Ａ命令のＤ〜
P1ステージが実行される。しかし、次のＬ(3)命
令に対してアドレスコンフリクトを有するので、
このコンフリクトが解消するまで、Ｌ(3)命令のＤ
ステージは実行されない。 In this way, the D through P1 stages of the L(1), L(2), and A instructions are executed. However, because there is an address conflict with the next instruction, L(3), the D stage of the L(3) instruction is not executed until this conflict is resolved.

上記先頭に述べた先願によると、このコンフリ
クトが解消されるのは、Ａ命令の第2Eユニツト
での演算結果が得られた時以降であるが、Ａ命令
に先行するＬ(1)，Ｌ(2)命令が第１、第2Eユニツ
トのいずれでも演算を行い、しかも上記先願では
演算実行時間が同じであるため、Ａ命令における
第１、第2Eユニツトでの演算結果が得られる時
刻は同じである。従つて、上記先願において、Ａ
命令の第2Eユニツトでの演算結果が得られるの
は、Ｃ１２サイクルとなる。ゆえに、Ｌ(3)命令の
Ｄステージは、Ｃ１２サイクルから開始される。 According to the prior application mentioned at the outset, this conflict is resolved only after the operation result of the A instruction is obtained in the second E unit. However, the L(1) and L(2) instructions preceding the A instruction are operated on in both the first and second E units, and in the prior application the operation execution times are the same, so the operation results of the A instruction are obtained at the same time in the first and second E units. In the prior application, therefore, the operation result of the A instruction in the second E unit is obtained in cycle C12, and the D stage of the L(3) instruction consequently starts from cycle C12.

一方本実施例では、Ｌ(3)命令のＤステージの開
始を早めるために、第１、第2Eユニツトのいず
れでも演算を行う命令においては、第2Eユニツ
トでの演算サイクル数を第1Eユニツトにおける
演算サイクル数より短くし、しかも次命令の第
2Eユニツトにおける演算が前命令の演算終了後
直ちに開始できるようにしている。 In the present embodiment, by contrast, in order to start the D stage of the L(3) instruction earlier, for instructions operated on in both the first and second E units, the number of operation cycles in the second E unit is made smaller than the number of operation cycles in the first E unit, and the operation of the next instruction in the second E unit can start immediately after the operation of the preceding instruction ends.

以下この点をさらに詳しく説明する。 This point will be explained in more detail below.

（L2ステージの詳細）このステージは第2Eユニツト２０Ｂに必要な
データをセツトするステージである。このステー
ジは第2Iユニツト２２Ｂにより制御される。この
第2Iユニツト２２Ｂには第６図に示されるよう
に、命令キユーレジスタ１４内のレジスタ＃０〜
＃２に対応してフリツプフロツプ１０１〜１０３
がそれぞれ設けられ、これらのフリツプフロツプ
は命令キユーレジスタ１４内の対応するレジスタ
に有効な解読情報がセツトされているか否かを表
示するためのものである。すなわち、デコーダ１
００はデコード成功信号DSにより起動され、そ
こに入力される入力ポインタIPで示されるレジ
スタ番号に対応するフリツプフロツプ１０１〜１
０３のいずれかに対して１信号を出力する。この
信号は、フリツプフロツプ１０１〜１０３内の、
入力ポインタIPに対応する一つのフリツプフロ
ツプのデータ端子に入力され、さらにオアゲート
１０７〜１０９の一つを介して、その一つのフリ
ツプフロツプのクロツク端子に入力される。こう
して、入力ポインタIPに対応してフリツプフロ
ツプ１０１〜１０３の一つがセツトされる。な
お、これらのフリツプフロツプ１０１〜１０３は
タイミングＴ０でのみ入力データの取り込みを行
うものとする。さらにフリツプフロツプ５３８〜
５４０はそれぞれフリツプフロツプ１０１〜１０
３の出力を、フリツプフロツプ５４６〜５４８は
それぞれ５３８〜５４０の出力を、またフリツプ
フロツプ１２０〜１２２もそれぞれフリツプフロ
ツプ５４６〜５４８の出力を１サイクル遅延して
出力するためのもので、これらのフリツプフロツ
プ５３８〜５４０，５４６〜５４８，１２０〜１
２２もタイミングＴ０でのみ出力を変化するもの
とする。(Details of the L2 stage) This stage sets the data required by the second E unit 20B and is controlled by the second I unit 22B. As shown in FIG. 6, the second I unit 22B has flip-flops 101 to 103 provided in correspondence with registers #0 to #2 in the instruction queue register 14; these flip-flops indicate whether valid decode information is set in the corresponding register of the instruction queue register 14. Specifically, decoder 100 is activated by the decode success signal DS and outputs a 1 signal to whichever of flip-flops 101 to 103 corresponds to the register number indicated by the input pointer IP supplied to it. This signal is input to the data terminal of the one flip-flop among 101 to 103 that corresponds to the input pointer IP and, via one of OR gates 107 to 109, to the clock terminal of that same flip-flop. One of flip-flops 101 to 103 is thus set in accordance with the input pointer IP. These flip-flops 101 to 103 take in input data only at timing T0. Further, flip-flops 538 to 540 output the outputs of flip-flops 101 to 103, respectively, delayed by one cycle; flip-flops 546 to 548 do the same for the outputs of flip-flops 538 to 540; and flip-flops 120 to 122 do the same for the outputs of flip-flops 546 to 548. Flip-flops 538 to 540, 546 to 548, and 120 to 122 likewise change their outputs only at timing T0.

フリツプフロツプ１４９，１５０，１５３，１
５４、インクリメンタ１４８は０，１，２を順次
カウントするカウンタを構成し、フリツプフロツ
プ１５３，１５４の出力は、第2Eユニツト２０
Ｂで実行されるべき命令に関する情報の選択のた
めの出力ポインタOP２として用いられる。すな
わち、この出力ポインタOP２は、線５３Ｂを介
してセレクタ１６Ｂ，３０Ｂに入力され、実行す
べき命令の解読情報およびメモリオペランド
MDATA2がこれらのセレクタにより選択的に線
４４Ｂ，４５Ｂにそれぞれ出力されるようにな
る。しかし、これらの情報を第2Eユニツト２０
Ｂにセツトしてもよいのは第2Eユニツト２０Ｂ
において先行する命令のための演算が終了し、か
つ次にL2ステージを実行しようとする命令のた
めの情報が命令キユーレジスタ１４、オペランド
キユーバツフア２８にセツトされている場合であ
る。第2Eユニツト２０Ｂは後述のように第２演
算開始信号BOP２に応答してセレクタ１６Ｂ，
３０Ｂの出力をセツトし、演算の最終サイクルか
ら終了信号EOP２を出力するようになつている。
第2Iユニツト２２Ｂでは、出力ポインタOP２に
よりフリツプフロツプ１２０〜１２２の出力の一
つがセレクタ１２９により選択される。この選択
された信号は出力ポインタOP２で示される命令
キユーレジスタ１４内のレジスタに実行すべき命
令がセツトされていることを示している。したが
つて、第2Iユニツト２２Ｂは、線４８Ｂを介して
入力される信号EOP２とセレクタ１２９の出力
信号との論理積をとり、両方の信号がともに１の
ときにのみ演算開始信号BOP２を線５２Ｂに出
力するようになつている。 Flip-flops 149, 150, 153, and 154 and incrementer 148 constitute a counter that counts 0, 1, 2 in sequence, and the outputs of flip-flops 153 and 154 are used as the output pointer OP2 for selecting the information on the instruction to be executed by the second E unit 20B. That is, the output pointer OP2 is input via line 53B to selectors 16B and 30B, which selectively output the decode information of the instruction to be executed and the memory operand MDATA2 onto lines 44B and 45B, respectively. This information may be set in the second E unit 20B, however, only when the operation for the preceding instruction has finished in the second E unit 20B and the information for the instruction that is next to execute the L2 stage has been set in the instruction queue register 14 and the operand queue buffer 28. As described later, the second E unit 20B latches the outputs of selectors 16B and 30B in response to the second operation start signal BOP2 and outputs the end signal EOP2 from the final cycle of the operation. In the second I unit 22B, selector 129 selects one of the outputs of flip-flops 120 to 122 according to the output pointer OP2. The selected signal indicates that an instruction to be executed is set in the register of the instruction queue register 14 indicated by the output pointer OP2. The second I unit 22B therefore ANDs the signal EOP2 input via line 48B with the output signal of selector 129, and outputs the operation start signal BOP2 onto line 52B only when both signals are 1.
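
The start condition for the second E unit can be sketched as follows. This is only an illustrative model under stated assumptions: the function names `select_valid` and `bop2`, and the representation of the outputs of flip-flops 120 to 122 as a three-element sequence of 0/1 values, are hypothetical and do not appear in the patent.

```python
from typing import Sequence


def select_valid(delayed_valid: Sequence[int], op2: int) -> int:
    """Model of selector 129: pick the delayed valid flag (flip-flops 120 to 122)
    of the queue entry addressed by the output pointer OP2."""
    return delayed_valid[op2]


def bop2(eop2: int, delayed_valid: Sequence[int], op2: int) -> int:
    """BOP2 = EOP2 AND (selected valid flag).

    The operation start signal is raised only when the previous operation has
    ended (EOP2 = 1) and the entry addressed by OP2 holds an instruction.
    """
    return eop2 & select_valid(delayed_valid, op2)


# Entry #0 is valid and the previous operation has ended: start is allowed.
print(bop2(1, [1, 0, 0], 0))
# Entry #1 is not yet valid: no start, even though EOP2 is 1.
print(bop2(1, [1, 0, 0], 1))
```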

結局、このL2ステージでは次のようにして、
第2Eユニツト２０Ｂに必要な情報がセツトされ
る。セレクタ１６Ｂにより選択された解読情報の
内、OP CODE、書込みレジスタ番号R_W、第２
演算表示信号SUBGR、SUBCCおよび条件コー
ド変更表示信号CHGCCは線４４Ｂを介して直接
第2Eユニツト２０Ｂに送られ、読出しレジスタ
番号R_Rは第２汎用レジスタ１８Ｂに入力され、
レジスタオペランドRDATA2の読出しに用いら
れる。このオペランドは線４６Ｂを介して第2E
ユニツト２０Ｂに入力される。セレクタ３０Ｂに
より選択されたメモリオペランドMDATA2は線
４５Ｂを介して直接第2Eユニツト２０Ｂに送ら
れる。第2Eユニツト２０Ｂはこれらの情報を信
号BOP２に応答して取り込む。 In summary, the information required by the second E unit 20B is set in the L2 stage as follows. Of the decode information selected by selector 16B, the OP CODE, the write register number R_W, the second-operation indication signals SUBGR and SUBCC, and the condition-code change indication signal CHGCC are sent directly to the second E unit 20B via line 44B, while the read register number R_R is input to the second general-purpose register 18B and used to read out the register operand RDATA2. This operand is input to the second E unit 20B via line 46B. The memory operand MDATA2 selected by selector 30B is sent directly to the second E unit 20B via line 45B. The second E unit 20B takes in this information in response to the signal BOP2.

なお、第2Iユニツト２２Ｂでは、第６図に示す
ように、信号BOP２に応答してインクリメンタ
１４８が起動され、そのときの出力ポインタOP
２をカウントアツプした値を示す信号をフリツプ
フロツプ１４９，１５０に出力する。こうして出
力ポインタOP２は、信号BOP２が出力されるご
とに更新されることになる。また、信号BOP２
によりデコーダ１１３が起動される。このデコー
ダ１１３はそのときの出力ポインタの値に応じ
て、オアゲート１０７〜１０９の一つに信号１を
送る。このときデコーダ１００にはデコード成功
信号DSが入力されていないかあるいは信号DSが
入力されていてもそのときの入力ポインタの値は
出力ポインタOP２の値と異なるので、出力ポイ
ンタOP２に対応するフリツプフロツプ１０１又
は１０２又は１０３のデータ端子には、デコーダ
１００から信号１が入力されることはない。した
がつて、出力ポインタOP２に対応してフリツプ
フロツプ１０１〜１０３の一つがリセツトされる
ことになる。 In the second I unit 22B, as shown in FIG. 6, incrementer 148 is activated in response to the signal BOP2 and outputs to flip-flops 149 and 150 a signal representing the value obtained by incrementing the current output pointer OP2. The output pointer OP2 is thus updated each time the signal BOP2 is output. The signal BOP2 also activates decoder 113, which sends a 1 signal to one of OR gates 107 to 109 according to the current value of the output pointer. At this time, either the decode success signal DS is not input to decoder 100 or, even if it is, the value of the input pointer differs from the value of the output pointer OP2, so decoder 100 never supplies a 1 signal to the data terminal of the flip-flop 101, 102, or 103 corresponding to the output pointer OP2. Consequently, the one of flip-flops 101 to 103 that corresponds to the output pointer OP2 is reset.
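
The pointer update and the set/reset behaviour of flip-flops 101 to 103 can be sketched as below. This is a simplified model under assumptions: the names `next_op2` and `update_valid` are hypothetical, the T0/T1 phase timing is collapsed into a single step, and the clock-and-data wiring through OR gates 107 to 109 is abstracted into "set on DS, reset on BOP2".

```python
from typing import List

QUEUE_ENTRIES = 3  # registers #0 to #2 of the instruction queue register 14


def next_op2(op2: int, bop2: int) -> int:
    """Incrementer 148: advance the output pointer 0 -> 1 -> 2 -> 0 on BOP2."""
    return (op2 + 1) % QUEUE_ENTRIES if bop2 else op2


def update_valid(valid: List[int], ds: int, ip: int, bop2: int, op2: int) -> List[int]:
    """Flip-flops 101 to 103.

    Decoder 113 (driven by BOP2) clocks the entry addressed by OP2 with data 0,
    which resets it. Decoder 100 (driven by DS) clocks the entry addressed by
    IP with data 1, which sets it. The text states that IP and OP2 differ when
    both signals are active, so the order of the two updates does not matter.
    """
    new_valid = list(valid)
    if bop2:
        new_valid[op2] = 0
    if ds:
        new_valid[ip] = 1
    return new_valid


print(next_op2(2, 1))                        # pointer wraps around to 0
print(update_valid([1, 0, 0], 1, 1, 1, 0))   # entry 0 reset, entry 1 set
```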

すでに述べたように、仮定では、Ｌ(1)命令のＤ
ステージのためのデコード成功信号DSはタイミ
ングＣ０，Ｔ１で、出力され、このときの入力ポ
インタIPは０であるが、フリツプフロツプ１０
１がタイミングＣ１，Ｔ０でセツトされ、フリツ
プフロツプ５３８，５４６，１２０がこれより順
に１サイクルずつ遅れたタイミングＣ２，Ｃ３，
Ｃ４、の各Ｔ０でセツトされる。このときフリツ
プフロツプ１５３，１５４より出力される出力ポ
インタOP２は仮定により、L2ステージを実行す
べきＬ(1)命令に対するものでなければならず、値
０を示す。したがつて、セレクタ１２９の出力は
タイミングＣ４，Ｔ０では１であり、このときＬ
(1)命令の前の命令Ｘに対する第2Eユニツトでの
演算が終了していると仮定しているので、信号
EOP２は１である。したがつて、信号BOP２も
１となる。こうして第2Eユニツト２０Ｂに対し
てＬ(1)命令の実行に必要なデータがセツトされ
る。こうして、Ｌ(1)命令のL2ステージがサイク
ルＣ４に行なわれる。また、この信号BOP２に
よりインクリメンタ１４８が起動され、そのとき
の出力ポインタOP２をカウントアツプした値１
を出力する。この値はフリツプフロツプ１４９，
１５０にタイミングＣ４，Ｔ１で取り込まれ、さ
らに、フリツプフロツプ１５３，１５４にタイミ
ングＣ５，Ｔ０で取り込まれる。したがつて、出
力ポインタOP２はサイクルＣ５では、次のＬ(2)
命令のための値１に更新される。 As already stated, the decode success signal DS for the D stage of the L(1) instruction is assumed to be output at timing C0, T1, with the input pointer IP equal to 0 at that time. Flip-flop 101 is then set at timing C1, T0, and flip-flops 538, 546, and 120 are set at T0 of cycles C2, C3, and C4, respectively, each one cycle later than the last. By assumption, the output pointer OP2 output from flip-flops 153 and 154 at this time must be the one for the L(1) instruction, which is to execute the L2 stage, and indicates the value 0. The output of selector 129 is therefore 1 at timing C4, T0. Since the operation in the second E unit for instruction X, which precedes the L(1) instruction, is assumed to have finished by this time, the signal EOP2 is 1, and the signal BOP2 therefore also becomes 1. The data required to execute the L(1) instruction is thus set in the second E unit 20B, and the L2 stage of the L(1) instruction is performed in cycle C4. The signal BOP2 also activates incrementer 148, which outputs the value 1 obtained by incrementing the current output pointer OP2. This value is taken into flip-flops 149 and 150 at timing C4, T1, and then into flip-flops 153 and 154 at timing C5, T0. In cycle C5, therefore, the output pointer OP2 is updated to the value 1 for the next instruction, L(2).
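
The timing of the valid flag for queue entry #0 in the walkthrough above can be reproduced with a small sketch. The function name `valid_timeline` and the labels `FF101`, `FF538`, `FF546`, and `FF120` are assumptions chosen for illustration; cycles are represented as plain integers, so cycle C4 is written as 4.

```python
from typing import Dict


def valid_timeline(ds_cycle: int) -> Dict[str, int]:
    """Return the cycle (at timing T0) in which each flip-flop of the
    entry-#0 chain is set, given the cycle whose T1 carries the DS signal.

    Flip-flop 101 is set at T0 of the following cycle, and each later
    flip-flop in the chain follows one cycle after the previous one.
    """
    timeline = {}
    cycle = ds_cycle + 1
    for name in ("FF101", "FF538", "FF546", "FF120"):
        timeline[name] = cycle
        cycle += 1
    return timeline


# L(1): DS at C0, T1, so flip-flop 120 is set in C4, where BOP2 can be raised.
print(valid_timeline(0))
# L(3): DS at C8, T1, so flip-flop 120 is set in C12.
print(valid_timeline(8))
```

With DS at C0 the chain reaches flip-flop 120 in cycle C4, and with DS at C8 it reaches it in cycle C12, which agrees with the cycles given in the description for the L(1) and L(3) instructions.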

なお、信号BOP２によりデコーダ１１３がタ
イミングＣ４，Ｔ０で起動され、そのときの出力
ポインタOP２の値０に対するORゲート１０７お
よびフリツプフロツプ５３８，５４６，１２０の
リセツト入力端子Ｒに１信号を出力する。この結
果、フリツプフロツプ１０１，５３８，５４６，
１２０がタイミングＣ５，Ｔ０でリセツトされ
る。 The signal BOP2 also activates decoder 113 at timing C4, T0; decoder 113 outputs a 1 signal to OR gate 107, which corresponds to the current output pointer value 0, and to the reset input terminals R of flip-flops 538, 546, and 120. As a result, flip-flops 101, 538, 546, and 120 are reset at timing C5, T0.

全く同じようにＬ(2)，Ａ命令のL2ステージが
行なわれ、それぞれサイクルＣ５，Ｃ６はこの命
令のための信号BOP２が出力される。 The L2 stages of the L(2) and A instructions are performed in exactly the same way, and the signal BOP2 for these instructions is output in cycles C5 and C6, respectively.

この間、出力ポインタOP２はタイミングＣ５，
Ｔ０で１になり、次にタイミングＣ６，Ｔ０で２
にさらに、Ｃ７，Ｔ０で０に更新される。 During this period, the output pointer OP2 becomes 1 at timing C5, T0, then 2 at timing C6, T0, and is then updated to 0 at timing C7, T0.

ただし、次のＬ(3)命令はＡ命令に対してアドレ
スコンフリクトがあるため、この命令のためのデ
コード成功信号DSがタイミングＣ８，Ｔ１で出
力されるので、Ｌ命令のL2ステージはサイクル
Ｃ１２まで延期される。したがつて、出力ポイン
タOP２はタイミングＣ７，Ｔ０での値０をそれ
ぞれ保持しつづけている。 However, because the next instruction, L(3), has an address conflict with the A instruction, the decode success signal DS for it is output at timing C8, T1, so the L2 stage of the L(3) instruction is postponed until cycle C12. The output pointer OP2 therefore continues to hold the value 0 that it took at timing C7, T0.

タイミングＣ８，Ｔ１でＬ(3)命令のためのデコ
ード成功信号DSが出力されると、タイミングＣ
９，Ｔ０でフリツプフロツプ１０１がセツトされ
る。その後Ｌ(1)，Ｌ(2)，Ａ命令の場合と全く同様
にタイミングＣ１２，Ｔ０でＬ(3)命令のための演
算開始信号BOP２が出力される。 When the decode success signal DS for the L(3) instruction is output at timing C8, T1, flip-flop 101 is set at timing C9, T0. Thereafter, exactly as for the L(1), L(2), and A instructions, the operation start signal BOP2 for the L(3) instruction is output at timing C12, T0.

（E2ステージの動作）第2Eユニツト２０Ｂでは第７図に示すように
レジスタ２９８〜３０５にそれぞれ、第２演算表
示信号SUBCC、条件コード変更表示信号
CHGCC演算開始信号BOP２、レジスタ変更信号
SUBGR、書込みレジスタ番号R_W、オペレーシ
ヨンコードOP CODE、レジスタオペランド
RDATA2、メモリオペランドMDATA2が信号
BOP２に応答してL2ステージの動作によりセツ
トされている。第２演算回路３０７は信号BOP
２で起動され、レジスタ３０３にセツトされた
OP CODEにより指定される演算を行ない、演算
結果WDATA2をレジスタ３１０に送出する。ま
た、OP CODEが条件コードを変更する演算を指
示しているときには、条件コードCC２を演算結
果WDATA2と演算の種類に依存して算出して出
力する。また演算の最終サイクルで線４８Ｂに演
算終了信号EOP２を出力するようになつている。
この信号は、次にBOP２信号が入力されるまで
毎サイクル出力される。また、入力されたOP
CODEで指定される演算をこの第２演算回路３０
７が実行できないときでも、この回路３０７はこ
の信号EOP２を次に信号BOP２が入力されるま
で毎サイクル出力する。したがつて、現在、仮定
では、第２演算回路３０７は１マシンサイクルの
演算しか実行しないと仮定しているので、信号
EOP２は毎サイクル出力されることになる。ま
た、レジスタ３１０に演算結果WDATA2がタイ
ミングＴ０でセツトされ、レジスタ３０２にセツ
トされている書込みレジスタ番号R_Wがレジスタ
３０９にタイミングＴ０でセツトされ、同様にレ
ジスタ３００，３０１にそれぞれセツトされた信
号BOP２とSUBGRとが論理積ゲートを介してタ
イミングＴ０でレジスタ３０８にセツトされる。
レジスタ３０８の出力はその値が１のときに第２
汎用レジスタ１８Ｂに結果WDATA2を書込むべ
きことを示す書込信号WC２である。(Operation of the E2 stage) In the second E unit 20B, as shown in FIG. 7, registers 298 to 305 hold, respectively, the second-operation indication signal SUBCC, the condition-code change indication signal CHGCC, the operation start signal BOP2, the register change signal SUBGR, the write register number R_W, the operation code OP CODE, the register operand RDATA2, and the memory operand MDATA2, all set by the L2 stage in response to the signal BOP2. The second operation circuit 307 is activated by the signal BOP2, performs the operation specified by the OP CODE set in register 303, and sends the operation result WDATA2 to register 310. When the OP CODE specifies an operation that changes the condition code, the circuit also calculates and outputs the condition code CC2, which depends on the operation result WDATA2 and the type of operation. In the final cycle of the operation it outputs the operation end signal EOP2 onto line 48B, and this signal is output every cycle until the next BOP2 signal is input. Even when the second operation circuit 307 cannot execute the operation specified by the input OP CODE, circuit 307 outputs the signal EOP2 every cycle until the next signal BOP2 is input. Since the second operation circuit 307 is assumed here to execute only one-machine-cycle operations, the signal EOP2 is output every cycle. The operation result WDATA2 is set in register 310 at timing T0, the write register number R_W held in register 302 is set in register 309 at timing T0, and likewise the AND of the signals BOP2 and SUBGR held in registers 300 and 301, respectively, is set in register 308 at timing T0. The output of register 308 is the write signal WC2, which, when its value is 1, indicates that the result WDATA2 should be written into the second general-purpose register 18B.

レジスタ２９８の出力VALIDは条件コードCC
２が有効であることを示す信号であり、第２演算
表示信号SUBCCが１である命令のL2ステージが
実行されたときに１となる。すなわち、L2ステ
ージが実行された命令の演算が第2Eユニツト２
０Ｂで実行可能であり、かつ、条件コードを変更
する命令のときに信号VALIDが１となり、それ
以外の命令のときには０となる。 The output VALID of register 298 is a signal indicating that the condition code CC2 is valid, and becomes 1 when the L2 stage is executed for an instruction whose second-operation indication signal SUBCC is 1. That is, the signal VALID becomes 1 when the instruction whose L2 stage was executed can be operated on by the second E unit 20B and changes the condition code, and becomes 0 for any other instruction.

また、レジスタ２９９の出力とレジスタ３００
の出力の論理積がアンドゲートから出力される。
この論理積SET２は、条件コードを変更する命
令のＥステージで１となり、それ以外の命令では
０となる。この信号SET２は、条件コードCC２
と有効表示信号VALIDを条件コードレジスタ３
４Ｂ（第１Ｂ図）にセツトするのに用いられる。 The AND of the output of register 299 and the output of register 300 is output from an AND gate. This AND output, SET2, becomes 1 in the E stage of an instruction that changes the condition code and 0 for any other instruction. The signal SET2 is used to set the condition code CC2 and the validity indication signal VALID in the condition code register 34B (FIG. 1B).
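
The three control signals derived in the second E unit can be summarized in a minimal sketch. The function name `e2_control` and the dictionary return value are assumptions for illustration only; the one-cycle latching of WC2 in register 308 is not modelled.

```python
from typing import Dict


def e2_control(subcc: int, chgcc: int, bop2: int, subgr: int) -> Dict[str, int]:
    """Derive the E2-stage control signals from the latched decode bits.

    VALID = SUBCC            (register 298)
    SET2  = CHGCC AND BOP2   (registers 299 and 300)
    WC2   = BOP2 AND SUBGR   (registers 300 and 301, latched in register 308)
    """
    return {
        "VALID": subcc,
        "SET2": chgcc & bop2,
        "WC2": bop2 & subgr,
    }


# An instruction that the second E unit can execute and that changes both a
# general-purpose register and the condition code (such as the A instruction).
print(e2_control(subcc=1, chgcc=1, bop2=1, subgr=1))
# An instruction that changes the condition code but cannot be executed by
# the second E unit: SET2 is still 1, but VALID and WC2 are 0.
print(e2_control(subcc=0, chgcc=1, bop2=1, subgr=0))
```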

Ｌ(1)，Ｌ(2)，Ａ命令に対しては各々サイクルＣ
５，Ｃ６，Ｃ７で演算が行なわれ、結果データ
WDATA2、書込レジスタ番号R_W、書込み信号
WC２がそれぞれタイミングＣ６，Ｃ７，Ｃ８の
Ｔ０で線５０Ｂに出力されることになる。Ｌ(3)命
令についてもサイクルＣ１３で演算が同じように
行なわれる。 For the L(1), L(2), and A instructions, the operations are performed in cycles C5, C6, and C7, respectively, and the result data WDATA2, the write register number R_W, and the write signal WC2 are output onto line 50B at T0 of cycles C6, C7, and C8, respectively. The operation for the L(3) instruction is performed in the same way in cycle C13.

またＡ命令においては、条件コードCC２とそ
のセツト信号SET２、有効信号VALIDがタイミ
ングＣ７，Ｔ０にて線７０Ｂに出力される。 For the A instruction, the condition code CC2, its set signal SET2, and the validity signal VALID are output onto line 70B at timing C7, T0.

（P2ステージの詳細） E2ステージでこのように求められた結果デー
タWDATA2は線５０Ｂを介して第２汎用レジス
タ１８Ｂに送られ、書込み信号WC２が１のとき
に番号R_Wで示されるレジスタに書込まれる。(Details of the P2 stage) The result data WDATA2 obtained in this way in the E2 stage is sent to the second general-purpose register 18B via line 50B and, when the write signal WC2 is 1, is written into the register indicated by the number R_W.

また、セツト信号SET２が１のときには、条
件コードレジスタ３４Ｂ（第１Ｂ図）に条件コー
ドCC２と有効表示信号VALIDがセツトされる。
セツト信号SET２が０のときには、レジスタ３
４Ｂの内容はかわらない。したがつて、E2ステ
ージを実行される命令が条件コードをかえる命令
のときには、レジスタ３４Ｂの内容が書きかえら
れる。したがつて、この命令が第2Eユニツトで
実行可能なときには、レジスタ３４Ｂの新しい内
容は、値１をもつVALID信号と第2Eユニツト２
０Ｂで新たに求められた条件コードCC２である。
しかし、この命令が第2Eユニツト２０Ｂで実行
できない命令のときには、レジスタ３４Ｂの新し
い内容は値０をもつVALID信号と、第2Eユニツ
ト２０Ｂから線７０Ｂに出力されている無意味な
データである。一方、E2ステージが実行された
命令が条件コードをかえない命令のときには、レ
ジスタ３４Ｂの内容は書きかえられない。 When the set signal SET2 is 1, the condition code CC2 and the validity indication signal VALID are set in the condition code register 34B (FIG. 1B). When SET2 is 0, the contents of register 34B do not change. Hence, when the instruction executing the E2 stage changes the condition code, the contents of register 34B are rewritten. If that instruction can be executed by the second E unit, the new contents of register 34B are a VALID signal with the value 1 and the condition code CC2 newly obtained by the second E unit 20B. If it cannot be executed by the second E unit 20B, the new contents of register 34B are a VALID signal with the value 0 and the meaningless data being output from the second E unit 20B onto line 70B. When the instruction whose E2 stage was executed does not change the condition code, on the other hand, the contents of register 34B are not rewritten.
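
The update rule for condition code register 34B can be sketched as follows. The function name `update_cc_register` and the representation of the register as a `(VALID, CC)` tuple are assumptions made for illustration.

```python
from typing import Tuple

CcRegister = Tuple[int, int]  # (VALID, condition code), hypothetical layout


def update_cc_register(current: CcRegister, set2: int, valid: int, cc2: int) -> CcRegister:
    """Model of register 34B in the P2 stage.

    SET2 = 1: load the VALID flag and whatever the second E unit drives as CC2.
              If VALID is 0, the stored condition code is meaningless data.
    SET2 = 0: keep the previous contents.
    """
    if set2:
        return (valid, cc2)
    return current


reg = (0, 0)
reg = update_cc_register(reg, set2=1, valid=1, cc2=2)  # CC-changing, executable in E2
print(reg)
reg = update_cc_register(reg, set2=0, valid=0, cc2=3)  # does not change the CC
print(reg)
```

A consumer of the register, such as the branch determination described later, therefore has to check VALID before trusting the stored condition code.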

Ｌ(1)，Ｌ(2)，Ａ命令については、それぞれタイ
ミングＣ７，Ｃ８，Ｃ９のＴ０にて演算結果が第
２汎用レジスタに書き込まれ、またＬ(3)命令につ
いてはタイミングＣ１５，Ｔ０にて書き込まれ
る。また、Ａ命令については、タイミングＣ８，
Ｔ０にて条件コードと値１を持つVALID信号が
レジスタ３４Ｂにセツトされる。 For the L(1), L(2), and A instructions, the operation results are written into the second general-purpose register at T0 of cycles C7, C8, and C9, respectively, and for the L(3) instruction at timing C15, T0. For the A instruction, the condition code and a VALID signal with the value 1 are set in register 34B at timing C8, T0.

本実施例においては従つて、Ａ命令の結果はＣ
９，Ｔ０にて第２汎用レジスタに書き込まれるた
め、この時点でこの結果をアドレス計算で用いる
必要のあるＬ(3)命令とのアドレスコンフリクトが
解消し、後述するようにアドレスコンフリクト検
出回路３２の出力ACONFが０となり、第1Iユニ
ツト２０Ａは次のＬ(3)命令のＤステージを開始で
きる。従つて後続のＬ(3)命令のＤステージはＣ８
にて開始することができ、上記先願に比較して４
ステージ（先願における２サイクル）早めること
ができる。 In the present embodiment, therefore, the result of the A instruction is written into the second general-purpose register at timing C9, T0. At this point the address conflict with the L(3) instruction, which needs this result for its address calculation, is resolved; as described later, the output ACONF of the address conflict detection circuit 32 becomes 0, and the first I unit 20A can start the D stage of the next instruction, L(3). The D stage of the subsequent L(3) instruction can therefore start in cycle C8, which is four stages (two cycles in the prior application) earlier than in the prior application.

この４サイクルの短縮の内訳けは、Ｄステージ
の開始後第２汎用レジスタに書き込みが終了する
までのサイクル数の短縮分２サイクルと、第2E
ユニツトが第1Eユニツトに比べＬ(1)，Ｌ(2)命令
を半分のピツチで演算を行えるため、１命令毎に
１サイクルずつ計２サイクル早期に第2Eユニツ
トでの演算が開始できたことによる短縮分２サイ
クルである。 This four-cycle reduction breaks down as follows: two cycles come from the reduction in the number of cycles from the start of the D stage until writing to the second general-purpose register is finished, and two cycles come from the fact that the second E unit can operate on the L(1) and L(2) instructions at half the pitch of the first E unit, so that operation in the second E unit can start one cycle earlier per instruction, two cycles in total.
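
The breakdown above is simple arithmetic, shown here as a short sketch. The variable names are hypothetical, and the cycle numbers are those given in the description (D stage of L(3) in cycle C12 in the prior application, C8 in this embodiment).

```python
# Start cycle of the D stage of L(3) in the prior application (cycle C12).
prior_d_start = 12

# Saving 1: fewer cycles between the start of the D stage and the end of the
# write into the second general-purpose register.
saved_by_latency = 2

# Saving 2: the second E unit handles L(1) and L(2) at half the pitch,
# gaining one cycle for each of these two preceding instructions.
preceding_instructions = 2
saved_by_pitch = 1 * preceding_instructions

total_saved = saved_by_latency + saved_by_pitch
new_d_start = prior_d_start - total_saved

print(total_saved)   # 4
print(new_d_start)   # 8, i.e. cycle C8
```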

ここで、第2Eユニツトが上記先願に比べ、実
際に半分のピツチで演算を行える上で、命令のデ
コードそのものが先願に比べ半分のピツチで行わ
れる必要があること、そして本実施例においてそ
のように構成していることは言うまでもない。 Needless to say, for the second E unit actually to operate at half the pitch of the prior application, instruction decoding itself must also be performed at half the pitch of the prior application, and the present embodiment is configured accordingly.

（アドレスコンフリクト検出動作）アドレスコンフリクト検出回路３２の構成は、
第８図に示すように上記先願におけると同様でよ
く、ここでは概略のみ説明する。(Address conflict detection operation) The configuration of the address conflict detection circuit 32, shown in FIG. 8, may be the same as in the prior application, and only an outline is given here.

フリツプフロツプ２００〜２０２は、命令キユ
ーレジスタ１４の＃０〜＃２の各レジスタに対応
し、その中に入つている命令が汎用レジスタを変
更する命令であり、しかも未だ変更を終えていな
い状態であることを示す。レジスタ２１８〜２２
０は、命令キユーレジスタ１４の＃０〜＃２の各
レジスタに対応し、その中に汎用レジスタを変更
する命令が入つている場合、その書き込みレジス
タ番号R_Wを保持する。比較器２２４は、命令レ
ジスタに保持されている命令のインデツクスレジ
スタ及びベースレジスタ番号R_X，R_Bと、命令キ
ユーレジスタ内の汎用レジスタの変更の終つてい
ない命令の書き込みレジスタ番号とを比較し、比
較結果をオアゲート２３５に出力する。 Flip-flops 200 to 202 correspond to registers #0 to #2 of the instruction queue register 14 and indicate that the instruction held in the corresponding register is one that modifies a general-purpose register and has not yet completed the modification. Registers 218 to 220 likewise correspond to registers #0 to #2 of the instruction queue register 14 and, when the corresponding register holds an instruction that modifies a general-purpose register, hold its write register number R_W. Comparator 224 compares the index register and base register numbers R_X and R_B of the instruction held in the instruction register with the write register numbers of those instructions in the instruction queue register that have not yet finished modifying a general-purpose register, and outputs the comparison result to OR gate 235.

セレクタ２５６は、フリツプフロツプ２００〜
２０２の値のうち、第1Eユニツトでの演算が開
始したものを選択し、フリツプフロツプ２５８に
出力する。 Selector 256 selects, from among the values of flip-flops 200 to 202, the one for the instruction whose operation has started in the first E unit, and outputs it to flip-flop 258.

またセレクタ２２７は、レジスタ２１８〜２２
０に格納されている書き込みレジスタ番号のう
ち、第1Eユニツトでの演算が開始したものを選
択し、レジスタ２２８に出力する。比較器２３０
は、命令レジスタに保持されている命令のR_X，
R_Bと、第1Eユニツトでの演算中の、汎用レジス
タの変更の終つていない命令の書き込みレジスタ
番号とを比較し、比較結果をオアゲート２３５に
出力する。オアゲート２３５は線５８にアドレス
コンフリクトの発生を示すACONF信号を出力す
る。 Selector 227 selects, from among the write register numbers stored in registers 218 to 220, the one for the instruction whose operation has started in the first E unit, and outputs it to register 228. Comparator 230 compares R_X and R_B of the instruction held in the instruction register with the write register number of the instruction that is being operated on in the first E unit and has not yet finished modifying a general-purpose register, and outputs the comparison result to OR gate 235. OR gate 235 outputs onto line 58 the ACONF signal indicating that an address conflict has occurred.
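
The comparison performed by the address conflict detection circuit can be sketched in software. This is a simplified model under assumptions: the function name `aconf`, the `(pending, r_w)` tuples, and the treatment of every register number as a real register are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple

Entry = Tuple[int, int]  # (pending flag, write register number R_W)


def aconf(r_x: int, r_b: int, queue: Sequence[Entry], in_e1: Entry) -> int:
    """Return 1 when the instruction being decoded has an address conflict.

    queue -- per-entry state of flip-flops 200 to 202 and registers 218 to 220
    in_e1 -- state of flip-flop 258 and register 228 (instruction in the first E unit)
    """
    address_registers = {r_x, r_b}
    # Comparator 224: pending writers still in the instruction queue.
    hit_queue = any(pending and r_w in address_registers for pending, r_w in queue)
    # Comparator 230: pending writer currently being operated on in the first E unit.
    pending_e1, r_w_e1 = in_e1
    hit_e1 = bool(pending_e1) and r_w_e1 in address_registers
    # OR gate 235.
    return int(hit_queue or hit_e1)


# The decoded instruction uses register 5 as its base register while a queued
# instruction has not yet finished writing register 5: conflict.
print(aconf(r_x=3, r_b=5, queue=[(1, 5), (0, 5), (0, 0)], in_e1=(0, 0)))
# The same register number in an entry whose pending flag is 0: no conflict.
print(aconf(r_x=3, r_b=5, queue=[(0, 5), (0, 0), (0, 0)], in_e1=(0, 0)))
```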

フリツプフロツプ２００〜２０２及びレジスタ
２１８〜２２０へのセツトは、それぞれセツト回
路２５０，２２５によつてDS信号の立つた時に、
IP信号で示される番号のものに対してなされる。
フリツプフロツプ２００〜２０２のリセツトは、
第2Eユニツトの演算結果の第２汎用レジスタへ
の書き込み信号WC２Ｇが立つた時にリセツト回
路２５４によつて、また第2Eユニツトでは演算
の行われない命令については第1Eユニツトでの
演算が開始した時にリセツト回路２５２によつて
行われる。 Flip-flops 200 to 202 and registers 218 to 220 are set by set circuits 250 and 225, respectively, when the DS signal is asserted, for the entry whose number is indicated by the IP signal. Flip-flops 200 to 202 are reset by reset circuit 254 when the signal WC2G for writing the operation result of the second E unit into the second general-purpose register is asserted and, for instructions that are not operated on in the second E unit, by reset circuit 252 when the operation starts in the first E unit.

レジスタ２３７，５４２は、OP１の遅延信号
を作成する。レジスタ２３６，５４３はBOP１
の遅延信号を作成する。 Registers 237 and 542 generate a delayed version of OP1. Registers 236 and 543 generate a delayed version of BOP1.

第９Ｃ図に、フリツプフロツプ２００〜２０
２，２５８、レジスタ２１８〜２２０，２２８、
比較器２２４，２３０、ACONF信号５８のタイ
ムチヤートを示す。 FIG. 9C shows a time chart of flip-flops 200 to 202 and 258, registers 218 to 220 and 228, comparators 224 and 230, and the ACONF signal 58.

〔動作の詳細〕条件コードコンフリクトがある場合、この場合
の装置とその動作説明のために以下では、Ｌ(1)，
Ａ，BCがこの順に実行され、Ｌ(2)命令が実行さ
れるものとする。また、これらの命令間およびこ
れらに先行する命令とこれらの命令の間でアドレ
スコンフリクトがないと仮定する。[Details of operation] Case with a condition code conflict. To describe the apparatus and its operation in this case, it is assumed below that the L(1), A, and BC instructions are executed in this order and that the L(2) instruction is then executed. It is also assumed that there is no address conflict among these instructions or between them and the instructions preceding them.

（BC命令の処理）このときの動作のタイムチヤートは第１１Ａ〜
１１Ｂ図に示される。ただし、この図では、条件
コードコンフリクトの動作の理解に必要な信号の
みを示してある。(Processing of the BC instruction) The time chart of this operation is shown in FIGS. 11A and 11B. These figures show only the signals needed to understand the operation under a condition code conflict.

Ｌ(1)，Ａ命令は先に述べたのと全く同じように
実行され、BC命令のＤステージがサイクルＣ２
で実行されると仮定する。 Assume that the L(1) and A instructions are executed exactly as described above and that the D stage of the BC instruction is executed in cycle C2.

BC命令のＤ，D′ステージにおいては、Ｌ(1)，
Ａ命令と同様に加算器２４でアドレスが算出され
るが、このアドレスは分岐先の命令のアドレスで
ある。BC命令の命令デコーダ１２による解読情
報は、他の命令と同じようにデコーダ成功信号
DS（これはタイミングＣ２，Ｔ１で出力される）
に応答して命令キユーレジスタ１４に格納され
る。また、命令デコーダ１２の出力の内、分岐命
令表示信号BC及びマスク信号MASKが線４２を
介して条件コードコンフリクト検出回路３６に送
られる。回路３６は、後述するように条件コード
コンフリクト信号CCONFをタイミングＣ３，Ｔ
０で１にする。第1Iユニツト２２Ａはこの信号に
応答して、後続の命令に対するデコード成功信号
DSの発生を抑止する。こうしてBC命令のＤ，
D′ステージが終了する。BC命令のＡ，A′ステー
ジにおいては上述の命令アドレスに基づきメイン
メモリ２６からタイミングＣ５，Ｔ０で読出され
たＬ(2)命令がフエツチ回路（図示せず）の制御に
より命令バツフア６Ａ，６Ｂの内のターゲツトス
トリーム側にタイミングＣ６，Ｔ０に格納され
る。フエツチ回路はさらに引続き、このＬ(2)命令
につづく命令列を順次メインメモリ２６から読出
し、ターゲツトストリーム側の命令バツフア６Ａ
又は６Ｂに順次格納する。 In the D and D′ stages of the BC instruction, an address is calculated by adder 24 as for the L(1) and A instructions, but this address is the address of the branch target instruction. The decode information produced for the BC instruction by instruction decoder 12 is stored in the instruction queue register 14 in response to the decode success signal DS (output at timing C2, T1), as for other instructions. Of the outputs of instruction decoder 12, the branch instruction indication signal BC and the mask signal MASK are sent via line 42 to the condition code conflict detection circuit 36. As described later, circuit 36 sets the condition code conflict signal CCONF to 1 at timing C3, T0. In response to this signal, the first I unit 22A suppresses generation of the decode success signal DS for subsequent instructions. This completes the D and D′ stages of the BC instruction. In the A and A′ stages of the BC instruction, the L(2) instruction, read from main memory 26 at timing C5, T0 on the basis of the above instruction address, is stored at timing C6, T0, under the control of a fetch circuit (not shown), in whichever of instruction buffers 6A and 6B is on the target stream side. The fetch circuit then continues to read the instruction sequence following the L(2) instruction from main memory 26 and to store it sequentially in the target-stream-side instruction buffer 6A or 6B.

サイクルＣ６においては、Ａ命令の第2Eユニ
ツト２０Ｂでの演算が終了するため、信号EOP
２がこのサイクルにおいて出力される。したがつ
て、第2Iユニツト２２Ｂより信号BOP２がこの
サイクルＣ６で出力される。このため、BC命令
のL2ステージがサイクルＣ６において可能とな
る。 In cycle C6, the operation of the A instruction in the second E unit 20B finishes, so the signal EOP2 is output in this cycle. The second I unit 22B therefore outputs the signal BOP2 in cycle C6, and the L2 stage of the BC instruction becomes possible in cycle C6.

BC命令のL2ステージにおいては、セレクタ１
６Ｂにより選択された、BC命令の解読情報が第
2Eユニツト２０Ｂにセツトされる。BC命令が必
要とする演算は、この解読情報に基づく分岐成功
の判定である。しかし、他の命令と異なり、本実
施例ではこの判定が条件コードコンフリクト検出
回路３６によりなされる。したがつて、セレクタ
１６Ｂの出力は線４４Ｂを介して、この検出回路
３６に送られ、そこにセツトされる。なお、セレ
クタ１６Ｂの出力は他の命令と同じく、第2Eユ
ニツト２０Ｂにも送られる。 In the L2 stage of the BC instruction, the decode information of the BC instruction selected by selector 16B is set in the second E unit 20B. The operation required by the BC instruction is the branch success determination based on this decode information. Unlike for other instructions, however, in the present embodiment this determination is made by the condition code conflict detection circuit 36. The output of selector 16B is therefore sent via line 44B to detection circuit 36 and set there. As for other instructions, the output of selector 16B is also sent to the second E unit 20B.

また、L2ステージではセレクタ３０ＢがBC命
令に関するデータを選択するように制御される。
L2ステージの次のステージでは、上述のごとく、
分岐成功判定が回路３６で行われ、第2Eユニツ
ト２０Ｂは実質的に動作しないで、演算終了信号
EOP２をサイクルＣ７から出力するのみである。
しかし、他の命令と同じくこのステージをE2ユ
ニツトステージと呼ぶことにする。 In the L2 stage, selector 30B is also controlled so as to select the data relating to the BC instruction. In the stage following the L2 stage, as described above, the branch success determination is made by circuit 36; the second E unit 20B does not substantially operate and merely outputs the operation end signal EOP2 from cycle C7. As for other instructions, however, this stage is referred to as the E2 unit stage.

このE2ステージにおいては、条件コードコン
フリクト検出回路３６がレジスタ３４Ｂの出力に
基づく分岐成功判定が可能かどうかを検出する。
この分岐判定がE2ステージで可能となるのは、
BC命令の前の命令が条件コードを変更する命令
でかつ第2Eユニツト２０Ｂで実行できる命令で
ある。したがつて、レジスタ３４Ｂ内の信号
VALIDが１のときである。分岐判定が可能なと
きには、判定の終了時に、コンフリクト信号
CCONFを０にする。また、判定の結果、分岐が
成功のときには分岐成功信号BCTKNを線７４に
出力する。一方、分岐判定不可能のときには、信
号CCONFは変化しないし、信号BCTKNは出力
されない。 In this E2 stage, the condition code conflict detection circuit 36 detects whether the branch success determination can be made on the basis of the output of register 34B. The determination can be made in the E2 stage when the instruction preceding the BC instruction changes the condition code and can be executed by the second E unit 20B, that is, when the signal VALID in register 34B is 1. When the determination can be made, the conflict signal CCONF is set to 0 at the end of the determination, and if the result is that the branch succeeds, the branch success signal BCTKN is output onto line 74. When the determination cannot be made, the signal CCONF does not change and the signal BCTKN is not output.
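
The decision made by circuit 36 in the E2 stage can be sketched as follows. The description does not state how MASK is evaluated against the condition code; the sketch assumes a System/370-style rule in which a 4-bit mask has one bit per condition code value (leftmost bit for condition code 0). That rule, the function name `branch_decision`, and the tuple return value are all assumptions for illustration.

```python
from typing import Tuple


def branch_decision(valid: int, cc2: int, mask: int, cconf: int) -> Tuple[int, int]:
    """Return (new CCONF, BCTKN) for a BC instruction in the E2 stage.

    valid -- VALID flag from register 34B
    cc2   -- condition code from register 34B (0 to 3)
    mask  -- 4-bit MASK field (assumed: bit value 8 selects CC 0, 4 selects
             CC 1, 2 selects CC 2, and 1 selects CC 3)
    cconf -- current condition code conflict signal
    """
    if not valid:
        # Determination impossible: CCONF is unchanged and BCTKN is not output.
        return cconf, 0
    taken = (mask >> (3 - cc2)) & 1
    # Determination done: the conflict is released.
    return 0, taken


print(branch_decision(valid=1, cc2=0, mask=0b1000, cconf=1))  # branch succeeds
print(branch_decision(valid=0, cc2=0, mask=0b1111, cconf=1))  # must keep waiting
```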

今の場合、先行するＡ命令は第2Eユニツト２
０Ｂで実行可能で、かつ、条件コードを変更する
命令であるので、第2Eユニツト２０Ｂから、Ａ
命令のE2ステージの終了後には、レジスタ３４
Ｂ内の信号VALIDは値１を有する。したがつて、
回路３６は分岐判定可能なものとして、レジスタ
３４Ｂから入力される条件コードCC２と、命令
デコーダ１２からすでに入力されているマスク信
号MASKにより分岐判定を行う。この結果、分
岐成功と判定されたと仮定すると、信号BCTKN
が１となる。これによりフリツプフロツプ９が
（第１Ａ図）が反転される。また、条件コードコ
ンフリクト信号CCONFはタイミングＣ９，Ｔ０
でゼロになる。こうして、BC命令のE2ステージ
が終了する。 In this case, the preceding A instruction can be executed by the second E unit 20B and changes the condition code, so after the E2 stage of the A instruction in the second E unit 20B ends, the signal VALID in register 34B has the value 1. The circuit 36 therefore treats the branch determination as possible and performs it using the condition code CC2 input from register 34B and the mask signal MASK already input from the instruction decoder 12. Assuming that the branch is determined to be successful, the signal BCTKN becomes 1, which inverts flip-flop 9 (FIG. 1A). The condition code conflict signal CCONF becomes zero at timing C9, T0. This completes the E2 stage of the BC instruction.

次のステージ（これをP2ステージと呼ぶ）で
は次の動作が行なわれる。すなわち、フリツプフ
ロツプ９の反転に伴ない、読出し回路８（第１Ａ
図）は分岐先のＬ(2)命令をタイミングＣ８，Ｔ０
で命令レジスタ１０にセツトする。また、信号
CCONFが０であり、また、仮定によりアドレス
コンフリクトがないと仮定しているので、第1Iユ
ニツト２２Ａは次の命令に対するデコード成功信
号DSをタイミングＣ８，Ｔ１で出力する。こう
して、分岐成功の場合、分岐先のＬ(2)命令のＤス
テージがサイクルＣ８で実行可能となる。なお、
読出し回路８はBC命令のＤステージの開始後、
分岐不成功時に実行すべき命令を命令レジスタ１
０にあらかじめセツトする。したがつて、分岐判
定の結果、分岐不成功のときには、信号DSに応
答して、分岐不成功側の命令が実行される。 In the next stage (called the P2 stage), the following operations are performed. As flip-flop 9 is inverted, the readout circuit 8 (FIG. 1A) sets the branch destination instruction L(2) in instruction register 10 at timing C8, T0. Since the signal CCONF is 0 and no address conflict is assumed, the first I unit 22A outputs the decode success signal DS for the next instruction at timing C8, T1. Thus, when the branch is successful, the D stage of the branch destination instruction L(2) can be executed in cycle C8. Note that after the D stage of the BC instruction starts, the readout circuit 8 presets in instruction register 10 the instruction to be executed if the branch is unsuccessful. Therefore, when the branch determination finds the branch unsuccessful, the instruction on the unsuccessful-branch side is executed in response to the signal DS.

一方、上記先願においては、Ａ命令の第2Eユ
ニツトの演算はＣ８，Ｃ９サイクルにて行われ、
分岐判定はＣ１０サイクルに行われ、従つて、分
岐先命令Ｌ(2)のＤステージはＣ１２から開始され
る。 On the other hand, in the above-mentioned prior application, the operation of the second E unit for the A instruction is performed in cycles C8 and C9, and the branch determination is made in cycle C10; the D stage of the branch destination instruction L(2) therefore starts from C12.

ゆえに本発明によればＬ(2)のＤステージは上記
従来例に比較して４サイクル早めることができ
る。この４サイクルの短縮の内訳けは、Ｄステー
ジの開始後、第2Eユニツトによつて条件コード
が得られるまでのサイクル数の短縮分２サイクル
と、第2Eユニツトが第1Eユニツトに比べＬ(1)命
令を半分のピツチで演算を行えるため、Ａ命令の
第2Eユニツトでの演算が１サイクル早期に開始
できることによる短縮分１サイクルと、条件コー
ドが得られてから分岐判定及び命令レジスタへの
分岐先命令切り出しを行うまでのサイクル数の短
縮による１サイクルである。 Therefore, according to the present invention, the D stage of L(2) can be advanced by four cycles compared with the conventional example. This four-cycle reduction breaks down as follows: two cycles from shortening the number of cycles between the start of the D stage and the point at which the condition code is obtained by the second E unit; one cycle because the second E unit can process the L(1) instruction at half the pitch of the first E unit, so that the operation of the A instruction in the second E unit can start one cycle earlier; and one cycle from shortening the number of cycles between obtaining the condition code and completing the branch determination and the extraction of the branch destination instruction into the instruction register.
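The cycle accounting above can be checked with a few lines of Python. The cycle numbers (C12 for the prior application, C8 for this embodiment) and the 2 + 1 + 1 breakdown come from the text; the variable names and dictionary labels are only illustrative.

```python
# D-stage start cycle of the branch destination instruction L(2).
PRIOR_D_STAGE_CYCLE = 12  # prior application: D stage starts at C12
THIS_D_STAGE_CYCLE = 8    # this embodiment: D stage can run in C8

# Breakdown of the saving as described in the text (labels are paraphrases).
savings = {
    "condition code obtained sooner after the D stage starts": 2,
    "A instruction starts one cycle earlier in the second E unit": 1,
    "shorter path from condition code to branch destination fetch": 1,
}

total_saved = sum(savings.values())

# The itemized savings should account for the whole difference.
assert total_saved == PRIOR_D_STAGE_CYCLE - THIS_D_STAGE_CYCLE
```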

（条件コードコンフリクト検出回路とその動作）条件コードコンフリクト検出回路３６の構成は
第１０図に示すように、上記従来例におけるとほ
ぼ同様でよく、ここでは概略のみ説明する。(Condition Code Conflict Detection Circuit and Its Operation) As shown in FIG. 10, the configuration of the condition code conflict detection circuit 36 may be substantially the same as that in the conventional example, and only the outline will be described here.

フリツプフロツプ４０６は、条件コードコンフ
リクトの発生を示すCCONF信号を線７２に出力
する。４０６はBC命令についてDS信号が立つた
時にセツトされ、分岐判定回路４１７からの分岐
判定終了信号ENDに応答して、制御回路４０４
によりセツトされる。 Flip-flop 406 outputs on line 72 a CCONF signal indicating the occurrence of a condition code conflict. Flip-flop 406 is set when the DS signal rises for the BC instruction, and is reset by the control circuit 404 in response to the branch determination end signal END from the branch determination circuit 417.

アンドゲート４０９はBC命令の第1Eユニツト
での演算開始条件をフリツプフロツプ４１３に入
力する。４１３はこれを、CC１を用いて分岐判
定を行う場合の判定指令信号Ｊ１として、分岐判
定回路４１７に出力する。 The AND gate 409 inputs the operation start condition in the 1st E unit of the BC instruction to the flip-flop 413. 413 outputs this to the branch determination circuit 417 as a determination command signal J1 when branch determination is performed using CC1.

アンドゲート４１０はBC命令の第2Eユニツト
での演算開始条件をフリツプフロツプ４１４に入
力する。４１４はこれを、CC２を用いて分岐判
定を行う場合の判定指令信号Ｊ２として、分岐判
定回路４１７に出力する。 The AND gate 410 inputs the operation start condition in the second E unit of the BC instruction to the flip-flop 414. 414 outputs this to the branch determination circuit 417 as a determination command signal J2 when performing branch determination using CC2.

レジスタ４０３は、BC命令のマスク値MASK
を格納し、分岐判定回路４１７に出力する。 Register 403 stores the mask value MASK of the BC instruction and outputs it to the branch determination circuit 417.

分岐判定回路４１７は、分岐判定終了を意味す
るEND信号と、分岐成功を示すBCTKN信号と
を出力する。 Branch determination circuit 417 outputs an END signal indicating the end of branch determination and a BCTKN signal indicating branch success.
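How these elements interact can be sketched as a small behavioral model in Python. It is a rough sketch under stated assumptions, not the circuit of FIG. 10: the class and method names are hypothetical, the mask test (leftmost bit of a 4-bit MASK corresponds to condition code 0) is assumed, and END is assumed to clear flip-flop 406, in line with CCONF being set to 0 at the end of the determination.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ConflictDetector:
    cconf: int = 0  # output of flip-flop 406 (line 72)
    mask: int = 0   # contents of register 403

    def decode_bc(self, mask: int) -> None:
        """DS rises for a BC instruction: latch MASK and set flip-flop 406."""
        self.mask = mask
        self.cconf = 1

    def judge(self, j1: int, j2: int, cc1: int, cc2: int) -> Tuple[int, int]:
        """Model branch determination circuit 417; returns (END, BCTKN)."""
        if not (j1 or j2):
            return (0, 0)  # no determination command: nothing changes
        cc = cc1 if j1 else cc2  # J1 selects CC1, J2 selects CC2
        # ASSUMPTION: mask bit (0b1000 >> cc) selects condition code cc.
        bctkn = 1 if self.mask & (0b1000 >> cc) else 0
        # ASSUMPTION: END makes control circuit 404 clear flip-flop 406.
        self.cconf = 0
        return (1, bctkn)
```

A typical sequence is `decode_bc(mask)` when the BC instruction is decoded, followed by `judge(...)` once J1 or J2 is raised. CCONF stays at 1 in between, which is what holds back the decoding of the following instruction.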

CCONF，Ｊ１，Ｊ２，END，BCTKN及びそ
れらに関連する信号のタイムチヤートを第１１Ｂ
図に示す。 A time chart of CCONF, J1, J2, END, BCTKN, and their related signals is shown in FIG. 11B.

〔Effect of the invention〕

本発明によれば、システムプログラムのように
Load，Add等の演算ステージの短い命令が大半
を占めるプログラムにおいても第2Eユニツトに
よつて第1Eユニツトに比較して早期に演算結果
を求めることができるため、アドレスコンフリク
ト、BC命令の高速化が実現できる。最近の大型
汎用計算機においては、システムプログラムを実
行した場合の、１命令当りの平均処理時間のう
ち、アドレスコンフリクト及びBC命令の分岐判
定によるものはそれぞれ10％程度と考えられる
が、本発明はこれらを大幅に削減する効果があ
り、大型汎用計算機の高速化に有効と考えられ
る。 According to the present invention, even in programs such as system programs, in which instructions with short operation stages such as Load and Add make up the majority, the second E unit can obtain operation results earlier than the first E unit, so the handling of address conflicts and BC instructions can be sped up. In recent large general-purpose computers, address conflicts and BC instruction branch determination are each thought to account for about 10% of the average processing time per instruction when a system program is executed. The present invention has the effect of significantly reducing both and is considered effective in speeding up large general-purpose computers.

[Brief explanation of drawings]

第１Ａ，１Ｂ図は本発明による実施例の異なる
部分のブロツク回路図、第２図は、デイジタルコ
ンピユータを構成するための第１Ａ図、第１Ｂ図
の配置を示す。第３Ａ，３Ｂ図はそれぞれ本発明
による実施例で用いる一つの命令フオーマツトの
例を示す。第４図は上記実施例で用いる第１の命
令制御ユニツトの概略回路構成図である。第５図
は上記実施例で用いる第１の演算実行ユニツトの
概略構成図である。第６図は上記実施例で用いる
第２の命令制御ユニツトの概略構成図である。第
７図は上記実施例で用いる第２の演算実行ユニツ
トの概略構成図である。第８図は上記実施例で用
いるアドレスコンフリクト検出回路の概略構成図
である。第９Ａから９Ｃ図は、上記実施例のアド
レスコンフリクトがある場合のタイムチヤートで
ある。第９Ｄ，９Ｅ図は先願と本実施例における
命令処理ステージをそれぞれ示す。第１０図は上
記実施例で用いる条件コードコンフリクト検出回
路の概略構成図を示す。第１１Ａと１１Ｂ図は、
上記実施例の、条件コードコンフリクトがある場
合のタイムチヤートである。 FIGS. 1A and 1B are block circuit diagrams of different parts of an embodiment of the present invention, and FIG. 2 shows the arrangement of FIGS. 1A and 1B for constructing a digital computer. FIGS. 3A and 3B each show an example of an instruction format used in the embodiment of the present invention. FIG. 4 is a schematic circuit diagram of the first instruction control unit used in the above embodiment. FIG. 5 is a schematic diagram of the first arithmetic execution unit used in the above embodiment. FIG. 6 is a schematic diagram of the second instruction control unit used in the above embodiment. FIG. 7 is a schematic diagram of the second arithmetic execution unit used in the above embodiment. FIG. 8 is a schematic diagram of the address conflict detection circuit used in the above embodiment. FIGS. 9A to 9C are time charts for the case where there is an address conflict in the above embodiment. FIGS. 9D and 9E show the instruction processing stages in the prior application and in this embodiment, respectively. FIG. 10 is a schematic configuration diagram of the condition code conflict detection circuit used in the above embodiment. FIGS. 11A and 11B are time charts for the case where there is a condition code conflict in the above embodiment.

Claims

[Claims]

1. A data processing device that divides each instruction to be executed into a plurality of stages and executes them in a pipeline mode, comprising: a first arithmetic unit capable of executing a plurality of operations in a first time interval at the shortest; a second arithmetic unit capable of executing some of the plurality of operations, those with a relatively short execution time, in a second time interval smaller than the first time interval; means for decoding a plurality of instructions to be executed in chronological order so as to decode a subsequent instruction in parallel with the execution of the operation of a previously decoded instruction; holding means for temporarily holding a plurality of instructions; first instruction execution control means for sequentially instructing the first arithmetic unit to execute the operations requested by each of the plurality of instructions held in the holding means, at the first time interval at the shortest, the first instruction execution control means having means for instructing, in synchronization with the end of the operation being executed in the first arithmetic unit and asynchronously with the end of the operation in the second arithmetic unit, the execution of the operation of the next instruction held by the holding means; and second instruction execution control means, operating in parallel with the first instruction execution control means, for selecting, among the plurality of instructions held by the holding means, those instructions that request operations executable by the second arithmetic unit and sequentially instructing the second arithmetic unit to execute the operation requested by each of them at the second time interval at the shortest, the second instruction execution control means having means for instructing, asynchronously with the end of the operation in the first arithmetic unit, the execution of the operation of the next instruction, among the instructions held in the holding means, that requests an operation executable by the second arithmetic unit.

2. The data processing device according to claim 1, further comprising means for generating memory operands at a time interval in response to the decoding, by the decoding means, of those instructions among the plurality of instructions to be executed that use memory operands in the operation stage, wherein the first instruction execution control means, when instructing the first arithmetic unit to execute the instruction to be executed next, supplies to the first arithmetic unit the memory operand generated by the generating means for that instruction or the register operand specified by that instruction, and the second instruction execution control means, when instructing the second arithmetic unit to execute the instruction to be executed next, supplies to the second arithmetic unit the memory operand generated by the generating means for that instruction or the register operand specified by that instruction.

3. The data processing device according to claim 1 or 2, wherein the first time interval is a multiple of the second time interval.

4. The data processing device according to claim 3, wherein the first time interval is two machine cycles and the second time interval is one machine cycle.

5. A data processing device that divides each instruction to be executed into a plurality of stages and executes them in a pipeline mode, comprising: a first arithmetic unit capable of executing a plurality of operations in a first time interval at the shortest; a second arithmetic unit capable of executing some of the plurality of operations, those with a relatively short execution time, in a second time interval smaller than the first time interval; means for decoding a plurality of instructions to be executed in chronological order so as to decode a subsequent instruction in parallel with the execution of the operation of a previously decoded instruction; holding means for temporarily holding a plurality of instructions; first instruction execution control means for sequentially instructing the first arithmetic unit to execute the operation requested by each of the plurality of instructions held by the holding means, at the first time interval at the shortest, the first instruction execution control means having means for instructing, in synchronization with the end of the operation being executed in the first arithmetic unit and asynchronously with the end of the operation in the second arithmetic unit, the execution of the operation of the next instruction held in the holding means; second instruction execution control means for sequentially instructing the second arithmetic unit to execute, at the second time interval at the shortest, the operations requested by those instructions, among the plurality of instructions held by the holding means, that request operations executable by the second arithmetic unit, the second instruction execution control means having means for instructing, in synchronization with the end of the operation being executed in the second arithmetic unit and asynchronously with the end of the operation in the first arithmetic unit, the execution of the next instruction, among the instructions held in the holding means, that requests an operation executable by the second arithmetic unit; reading means for calculating, for those instructions decoded by the decoding means that use memory operands in the operation stage, the addresses of the memory operands required by each instruction and reading those memory operands at the second time interval at the shortest; and means for determining whether the operation result of a preceding instruction being executed or waiting to be executed in either the first arithmetic unit or the second arithmetic unit is needed and, when that result is needed, inhibiting the reading means from generating the address of the memory operand for the instruction until the result is made available by the first arithmetic unit or the second arithmetic unit.

6. The data processing device according to claim 5, wherein the first time interval is a multiple of the second time interval.

7. The data processing device according to claim 6, wherein the first time interval is two machine cycles and the second time interval is one machine cycle.

8. A data processing device that divides each instruction to be executed into a plurality of stages and executes them in a pipeline mode, comprising: a first arithmetic unit capable of executing a plurality of operations in a first time interval at the shortest; a second arithmetic unit capable of executing some of the plurality of operations, those with a relatively short execution time, in a second time interval smaller than the first time interval; means for decoding a plurality of instructions to be executed in chronological order so as to decode a subsequent instruction in parallel with the execution of the operation of a previously decoded instruction; holding means for temporarily holding a plurality of instructions; first instruction execution control means for sequentially instructing the first arithmetic unit to execute the operations requested by each of the plurality of instructions held by the holding means, at the first time interval at the shortest, the first instruction execution control means having means for instructing, in synchronization with the end of the operation being executed in the first arithmetic unit and asynchronously with the end of the operation in the second arithmetic unit, the execution of the next instruction held in the holding means; second instruction execution control means for sequentially instructing the second arithmetic unit to execute, at the second time interval at the shortest, the operations requested by those instructions, among the plurality of instructions held by the holding means, that request operations executable by the second arithmetic unit, the second instruction execution control means having means for instructing, in synchronization with the end of the operation being executed in the second arithmetic unit and asynchronously with the end of the operation in the first arithmetic unit, the execution of the next instruction, among the instructions held in the holding means, that requests an operation executable by the second arithmetic unit; means for determining, when an instruction decoded by the decoding means is a branch instruction, whether the branch is successful based on the value of a condition code for branch determination held in the data processing device; reading means for calculating the address of the branch destination instruction and reading the branch destination instruction; and means for determining whether the condition code is rewritten by the operation of a preceding instruction and, when it is determined that the condition code is rewritten by the operation of one of the preceding instructions, inhibiting the reading means from generating the address of the branch destination instruction for the branch instruction until the condition code for the branch instruction becomes available.

9. The data processing device according to claim 8, further comprising means for generating memory operands at a time interval in response to the decoding, by the decoding means, of those instructions among the plurality of instructions to be executed that use memory operands in the operation stage, wherein the first instruction execution control means, when instructing the first arithmetic unit to execute the instruction to be executed next, supplies to the first arithmetic unit the memory operand generated by the generating means for that instruction or the register operand specified by that instruction, and the second instruction execution control means, when instructing the second arithmetic unit to execute the instruction to be executed next, supplies to the second arithmetic unit the memory operand generated by the generating means for that instruction or the register operand specified by that instruction.

10. The data processing device according to claim 8 or 9, wherein the first time interval is a multiple of the second time interval.

11. The data processing device according to claim 10, wherein the first time interval is two machine cycles and the second time interval is one machine cycle.