JPH08115216A

JPH08115216A - A computer using a storage device with an address addition function

Info

Publication number: JPH08115216A
Application number: JP25037694A
Authority: JP
Inventors: Satoru Kokuni; 哲小國
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1994-10-17
Filing date: 1994-10-17
Publication date: 1996-05-07

Abstract

PURPOSE: To prevent a pipeline stall even when data read from memory by a load instruction is required by the immediately following instruction, by giving the storage device an addition function. CONSTITUTION: A data storage device 220 receives two input addresses, an index X and a base B, and uses the address obtained by adding them as its read address or write address to read or write data. The pipeline therefore consists of three stages: ID, EX, and WB. In the EX stage, for a load instruction or other instruction that accesses memory, the two data items set in register 205 are used as addresses, and data is read from the data storage device 220 and set in register 305. Because the result of the load instruction is already set in register 305 at the WB stage, no pipeline stall occurs even when the immediately following instruction uses it.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a computer having a storage device.

[0002]

[Prior Art] Conventional techniques relating to computer pipelines are described with reference to Chapter 6 (pp. 364 to 451) of "Computer Organization and Design: The Hardware/Software Interface" by John L. Hennessy et al., published by Morgan Kaufmann Publishers.

[0003] FIG. 7 shows a typical example of the pipeline stage configuration used in conventional computer systems. In this figure, 100, 200, 300, and 400 are pipeline registers (hereinafter simply called registers), which divide the pipeline into four stages: ID, EX, MEM, and WB.

[0004] In the ID stage, an instruction is set in register 100. The contents (data) of the registers are read from register file 110 according to the register numbers indicated by the instruction and are set in register 200.

[0005] In the EX stage, ALU 210 performs an operation according to the data set in register 200 and the contents of the instruction, and the result is set in register 300. For an instruction that performs neither an operation nor a memory access (such as a register-to-register copy instruction), the data in register 200 is set in register 300 unchanged.

[0006] In the MEM stage, for a load instruction or other instruction that accesses memory, the data set in register 300 is used as an address, and data is read from data memory 310 and sent to register 400. For an operation instruction or other instruction that does not access memory, the data set in register 300 is sent directly to register 400. In this way, data is set in register 400.

[0007] In the WB stage, the data set in register 400 is selected by multiplexer (MUX) 410 and written to register file 110.

[0008] As an example of the operation of the pipeline shown in FIG. 7, FIG. 12 shows a timing chart. FIG. 12 shows the case where the following instruction sequence is executed:
Add r4 r5 r6 (adds the contents of register 5 and register 6 and stores the result in register 4)
Load r2 r7 r8 (adds the contents of register 7 and register 8, loads data from the memory location indicated by the resulting address, and stores it in register 2)
Add r1 r2 r3 (adds the contents of register 2 and register 3 and stores the result in register 1)

[0009] In FIG. 12, each circle (○) indicates which stage each instruction is in at each point in time. Because there is no data dependency between the first and second instructions, each of their stages is processed in one cycle. The third instruction, however, needs the data loaded by the second instruction in its EX stage (that is, there is a data dependency), and the second instruction obtains that data only at the end of its MEM stage. The EX stage of the third instruction must therefore wait until the MEM stage of the second instruction has finished.
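The load-use delay described above can be reproduced with a small scheduling sketch. This is an illustration only, not part of the patent: the function name `schedule`, the tuple layout of `PROGRAM`, and the assumption that a result can be handed to the next instruction at the end of the stage that produces it are all choices made for this example.

```python
def schedule(program):
    """Return, per instruction, the cycle in which each stage is entered.

    Assumes (for illustration) a four-stage ID/EX/MEM/WB pipeline in which a
    result is usable by a later instruction one cycle after it is produced.
    """
    ready = {}      # register name -> cycle at whose end its new value exists
    timeline = []
    for op, dest, src_a, src_b in program:
        # ID starts when the previous instruction moves on to EX.
        id_cycle = timeline[-1]["EX"] if timeline else 1
        # EX needs both source operands, so the instruction may wait in ID.
        ex_cycle = max(id_cycle + 1,
                       ready.get(src_a, 0) + 1,
                       ready.get(src_b, 0) + 1)
        cycles = {"ID": id_cycle, "EX": ex_cycle,
                  "MEM": ex_cycle + 1, "WB": ex_cycle + 2}
        # A load delivers its data at the end of MEM, other ops at the end of EX.
        ready[dest] = cycles["MEM"] if op == "Load" else cycles["EX"]
        timeline.append(cycles)
    return timeline


# The instruction sequence of FIG. 12: (operation, destination, source, source).
PROGRAM = [
    ("Add", "r4", "r5", "r6"),
    ("Load", "r2", "r7", "r8"),
    ("Add", "r1", "r2", "r3"),
]

for (op, dest, a, b), cycles in zip(PROGRAM, schedule(PROGRAM)):
    print(op, dest, a, b, cycles)
```

Under these assumptions the second Add enters ID in cycle 3 but cannot enter EX until cycle 5, one cycle later than it could without the dependency on the load.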

[0010] FIG. 8 shows the pipeline for reading the instruction at the branch destination of a branch instruction (called the target instruction here). In the ID stage, the register contents (data) are read from register file 110 according to the register numbers indicated by the branch instruction and are set in register 200. In the EX stage, ALU 210 adds the data set in register 200, and the sum is set in register 300. In the MEM stage, the data set in register 300 is used as the address for reading the target instruction from memory; the target instruction is read from instruction memory 320 and written to register 100, where it becomes the next instruction to be executed. If the branch instruction branches only when a certain condition holds, either the target instruction or the instruction that follows the branch instruction in address order (simply called the next instruction) is selected according to the result of evaluating the condition.

[0011] The next instruction is also read from instruction memory 320, but here it is assumed to have been read from instruction memory 320 before the branch instruction is executed. This is possible by making the read width of instruction memory 320 at least twice the length of one instruction, so that two or more instructions are read per access. Many of today's computers use this method.

[0012] As an example of the operation of the pipeline shown in FIG. 8, FIG. 13 shows a timing chart. FIG. 13 shows the case where the following instruction sequence is executed:
Add r4 r5 r6 (adds the contents of register 5 and register 6 and stores the result in register 4)
Branch r7 r8 (depending on a condition, adds the contents of register 7 and register 8 and branches to the instruction at the memory location indicated by the resulting address)
Add r1 r2 r3 (adds the contents of register 2 and register 3 and stores the result in register 1)
For the first and second instructions, all stages are processed offset by one cycle from each other. The third instruction, however, is the instruction read by the second instruction, and the second instruction obtains it only at the end of its MEM stage. The ID stage of the third instruction must therefore wait until the MEM stage of the second instruction has finished.

[0013] The memories shown in FIGS. 7 and 8 are memories that, given an address, read data from the location the address indicates or write data to that location. In many cases, however, the memories shown in FIGS. 7 and 8 are cache storage devices. A cache storage device consists of several memories, such as an address translation buffer, a directory, and a data array.

[0014] As an example, FIG. 15 shows the internal memories for the case where memory 310 of FIG. 7 is a cache storage device. Here the cache storage device consists of data array 311, selection circuit 312, directory 313, and hit detection circuit 314 (in some cases there is also an address translation buffer). The directory stores entries that include, among other things, the addresses of the data held in the data array. The operation of reading data from the cache storage device is described below.

[0015] Data is read from the data array, and the entry corresponding to that data is read from directory 313. Hit detection circuit 314 uses the entry read from directory 313 and the read address set in pipeline register 300 to check whether the data read from data array 311 contains the desired data. If it does, selection circuit 312 selects the desired data from the data read from data array 311, and that data is set in pipeline register 400. If it does not, the data is transferred from the main storage device to data array 311.
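The read sequence above (data array read, directory read, hit detection, selection, and refill on a miss) can be sketched as follows. The direct-mapped organization, the sizes `LINE_WORDS` and `NUM_LINES`, and the class and method names are assumptions made for this example; the patent text does not specify them.

```python
LINE_WORDS = 4   # words per data-array line (hypothetical size)
NUM_LINES = 8    # number of lines in the cache (hypothetical size)


class Cache:
    """Direct-mapped cache sketch with a directory and a data array."""

    def __init__(self, main_memory):
        self.main_memory = main_memory            # list of words
        self.directory = [None] * NUM_LINES       # one tag entry per line
        self.data_array = [[0] * LINE_WORDS for _ in range(NUM_LINES)]

    def read(self, address):
        offset = address % LINE_WORDS
        index = (address // LINE_WORDS) % NUM_LINES
        tag = address // (LINE_WORDS * NUM_LINES)

        line = self.data_array[index]    # data array read
        entry = self.directory[index]    # directory read
        hit = entry == tag               # hit detection
        if not hit:
            # Miss: transfer the line from main storage into the data array.
            base = address - offset
            line = self.main_memory[base:base + LINE_WORDS]
            self.data_array[index] = line
            self.directory[index] = tag
        return line[offset], hit         # selection of the desired word


memory = list(range(100, 164))           # 64 words of sample data
cache = Cache(memory)
print(cache.read(5))    # first access to this line: a miss
print(cache.read(6))    # same line: a hit
```

In this sketch the directory comparison is the last step before the selected word can be trusted, which matches the observation in the text that the directory path tends to set the stage delay.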

[0016] In the pipeline configuration shown in FIG. 15, when the delays of the stages are compared, the MEM stage often has the largest delay, and within the MEM stage, as can be inferred from FIG. 15, the path that reads directory 313 often has the largest delay. In some cases, therefore, the MEM stage is split into two stages to shorten the machine cycle. When a subsequent instruction uses the result of the load instruction as an operand, as described above, this split delays the execution of the subsequent instruction by one more stage.

[0017] Next, conventional techniques relating to the circuit configuration of data memory 310 and instruction memory 320 are described with reference to Section 8.5, "Random Access Memory" (pp. 306 to 325), of "Principles of CMOS VLSI Design: A Systems Perspective" (Japanese translation supervised by Takashi Tomizawa and Yasuo Matsuyama, published by Maruzen Co., Ltd.).

[0018] FIG. 9 shows a typical example of the circuit configuration of data memory 310 and instruction memory 320. The memory considered here is one that, given an address, reads data from the location the address indicates or writes data to that location. The memory consists of row decoder 500, memory array 600, column decoder 700, and multiplexer 800. The arrows drawn between them indicate the direction of signal flow, and the number attached to each arrow indicates the number of signal lines.

[0019] First, the read operation is described. The read address is divided into three parts (upper, middle, and lower), of which only the middle part is used for reading. The middle part is further divided into an upper and a lower part, one of which becomes the row address and the other the column address. These addresses are decoded by row decoder 500 and column decoder 700, respectively. That is, when an M-bit row address is decoded, exactly one of the 2^M output signals becomes 1 and all the others become 0.
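The address split and the one-hot decoding described above can be sketched as follows. The text does not say which half of the middle part is the row address, so this example assumes the upper half is the row address and the lower half the column address; the function names and the bit widths in the usage lines are likewise hypothetical.

```python
def split_address(address, lower_bits, column_bits, row_bits):
    """Split an address into (upper, row, column, lower) fields.

    Assumption for illustration: within the middle part, the row address is
    the upper field and the column address is the lower field.
    """
    lower = address & ((1 << lower_bits) - 1)
    middle = address >> lower_bits
    column = middle & ((1 << column_bits) - 1)
    row = (middle >> column_bits) & ((1 << row_bits) - 1)
    upper = middle >> (column_bits + row_bits)
    return upper, row, column, lower


def decode(value, bits):
    """Decode a `bits`-wide value into 2**bits lines, exactly one of which is 1."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given number of bits")
    return [1 if line == value else 0 for line in range(1 << bits)]


upper, row, column, lower = split_address(0b1_101_10_11, 2, 2, 3)
print(row, column)        # row and column fields of the middle part
print(decode(row, 3))     # one-hot row select lines
```

Only the row and column fields reach the decoders; the upper and lower fields are returned here just to show where the remaining address bits go.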

[0020] In the memory array, memory cells that each store a 1-bit signal are arranged in a grid of 2^M by 2^N cells along the vertical (column) and horizontal (row) directions. The one row indicated by the signal that is 1 among the input signals from row decoder 500 is read out and output to multiplexer 800. Column decoder 700 performs the same kind of decoding as row decoder 500, and multiplexer 800 selects one column of 2^K signals from the 2^N signals of the row output by memory array 600. These signals are output as the read data.

[0021] Next, the write operation is described. As with the read address, the middle part of the write address becomes the row address and the column address, which are decoded by row decoder 500 and column decoder 700, respectively. The write data is input to multiplexer 800. Of the 2^N output signals from multiplexer 800 to memory array 600, the data is placed on the one column of 2^K signals that corresponds to the signal that is 1 among the 2^(N-K) output signals of column decoder 700, and the remaining signals are put in the high-impedance state. The data placed on that single column is then written into one column of 2^K memory cells within the one row of 2^N memory cells that corresponds to the signal that is 1 among the output signals of row decoder 500.
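The read and write paths through the memory array can be sketched with a small model of a 2^M by 2^N bit array accessed in groups of 2^K bits. The class name `MemoryArray` and the sizes used below are hypothetical, and the high-impedance state of unselected write lines is modelled simply by leaving those cells untouched.

```python
class MemoryArray:
    """Model of a 2**m-row by 2**n-bit array accessed in 2**k-bit columns."""

    def __init__(self, m, n, k):
        self.width = 1 << k                        # bits per selected column
        self.columns = 1 << (n - k)                # column groups per row
        self.cells = [[0] * (1 << n) for _ in range(1 << m)]

    def read(self, row, column):
        line = self.cells[row]                     # row decoder selects one row
        start = column * self.width
        return line[start:start + self.width]      # multiplexer picks one column

    def write(self, row, column, bits):
        if len(bits) != self.width:
            raise ValueError("write data must be exactly one column wide")
        start = column * self.width
        # Only the selected column is driven; other cells keep their contents.
        self.cells[row][start:start + self.width] = list(bits)


array = MemoryArray(m=2, n=3, k=1)   # 4 rows, 8 bits per row, 2-bit columns
array.write(1, 2, [1, 0])
print(array.read(1, 2))
print(array.read(1, 1))
```

With n = 3 and k = 1 there are 2^(3-1) = 4 column groups per row, which corresponds to the 2^(N-K) column decoder outputs mentioned in the text.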

[0022] Two or more output signals of row decoder 500 must never be 1 at the same time; otherwise the data stored in the memory array may be destroyed. In actual operation, however, a different address is input to the memory every cycle, so the output of row decoder 500 changes whenever the decoded result of the address input in the previous cycle differs from that of the address input in the current cycle. There is therefore a certain transition time during which two or more output signals can be 1, but data destruction can be prevented if this transition time is below a certain limit. In designing the memory, this transition time must therefore be made short enough that the data in memory array 600 is not destroyed.

[0023] In the write operation, of the signals output from multiplexer 800 to memory array 600, data is placed only on the column being written and the other signals are put in the high-impedance state. During the transition time of the output signals of column decoder 700, data may therefore be destroyed because another column is not in the high-impedance state. As with row decoder 500, the transition time of the output signals of column decoder 700 can be made short enough that data is not destroyed. Alternatively, the design constraint on the transition time of the output signals of column decoder 700 can be removed, for example by putting all the signal lines that carry write data into memory array 600 in the high-impedance state during that transition time.

[0024]

[Problems to Be Solved by the Invention] In the pipeline described in the prior art, the stage at which the result of a load instruction is obtained is one cycle later than the stage at which the result of an operation instruction is obtained. When the data read from memory by a load instruction is needed by the immediately following instruction, a pipeline stall occurs and performance drops. In addition, when a branch instruction is executed, a considerable number of cycles is needed before the target instruction is read, which also lowers performance.

[0025] Furthermore, in many processors the path that reads the directory of the cache storage device has a large delay, which lengthens the machine cycle; alternatively, splitting the stage lowers architectural performance.

[0026] If the addition in ALU 210 and the access to data memory 310 or instruction memory 320 were performed in a single pipeline cycle, the delay would increase and lengthen the machine cycle time, lowering performance. Moreover, if the random access memory shown in FIG. 9 were used as the memory, the design constraint on the transition time of the output signals, which is a particular problem for row decoder 500, could not be met.

[0027] An object of the present invention is to improve the pipeline configuration so that the result of a load instruction is obtained at the same stage as the result of an operation instruction, eliminating the need for a pipeline stall even when the data read from memory by a load instruction is needed by the immediately following instruction. Another object of the present invention is to reduce the number of cycles needed to read the target instruction when a branch instruction is executed.

[0028]

[Means for Solving the Problems] According to the present invention, there is provided a computer having a storage device that receives two data items output from a pipeline register and accesses the address obtained by adding them, wherein the data read from the storage device is set in a pipeline register. Also according to the present invention, there is provided a computer having a storage device that receives two data items output from a pipeline register and accesses the address obtained by adding them, wherein the instruction read from the storage device is set in a pipeline register.

[0029] Alternatively, there is provided a computer having a cache storage device directory that receives two data items output from a pipeline register and accesses the address obtained by adding them.

[0030] Further according to the present invention, there is provided a computer having a storage device that comprises two row decoders that decode two row addresses, a barrel shifter that shifts the output signals of one row decoder by the output signals of the other row decoder, and a memory array in which the row to be accessed is controlled by the output signals of the barrel shifter as input signals and which has signal lines for data input and output. There is also provided a computer having a storage device with memory cells arranged such that, when one row address signal line is activated, the switches of all the memory cells in the row corresponding to that row address, and the switches of only those memory cells with column address 0 in the row corresponding to that row address plus 1, are turned on and connected to the data input/output signal lines.

[0031]

[Operation] According to the present invention, address addition and access to the storage device can be performed in one cycle, so the number of pipeline stages for a memory access can be reduced. The result of a load instruction is obtained in the same cycle as that of an operation instruction, a pipeline stall is avoided when a subsequent instruction needs the result, and the performance of the computer is improved. In addition, the target instruction of a branch instruction can be read quickly, which also improves the performance of the computer.

[0032] Alternatively, by using a cache storage device directory that receives two data items output from a pipeline register and accesses the address obtained by adding them, the directory read and the hit detection circuit can be split into two stages without increasing the number of pipeline stages. A fast machine cycle can thus be achieved without adding pipeline stages, and the performance of the computer is improved.

[0033] Also according to the present invention, the storage device performs address addition by row-decoding the addresses and then performing a shift operation, so address addition can be made faster. Moreover, the memory array can be accessed without waiting for the carry to propagate to the most significant bit of the row address, which is the path with the largest delay in address addition, so the sequence from address addition to memory access can be made faster. As a result, address addition and memory access can be completed in one cycle without lengthening the machine cycle, and the transition-time constraint that poses a problem in storage device design can be satisfied.
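The idea of adding two row addresses by decoding first and shifting afterwards can be checked with a short sketch. It covers only the row-address part: a carry coming in from the column-address bits is ignored here, and the function names `one_hot` and `barrel_rotate` are hypothetical. The rotation amount is read back from the second decoder's one-hot output to mirror the arrangement in which one decoder's output controls the shifter.

```python
def one_hot(value, bits):
    """Decode a value into 2**bits select lines with a single 1."""
    return [1 if line == value else 0 for line in range(1 << bits)]


def barrel_rotate(lines, control):
    """Rotate `lines` by the position of the single 1 in `control`."""
    amount = control.index(1)
    size = len(lines)
    # Line i of the output takes the value that was `amount` positions below it.
    return [lines[(i - amount) % size] for i in range(size)]


BITS = 3
row_x = 0b101   # row-address field taken from the index X (sample value)
row_b = 0b110   # row-address field taken from the base B (sample value)

selected = barrel_rotate(one_hot(row_x, BITS), one_hot(row_b, BITS))
print(selected)
print(selected == one_hot((row_x + row_b) % (1 << BITS), BITS))
```

Rotating a one-hot vector never produces more than one active line, so the result is again a valid row select and no binary adder with a carry chain is needed for these bits.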

[0034]

[Embodiments] The first embodiment is described with reference to FIG. 1. Data storage device 220 used in the present invention receives two input addresses, an index X and a base B, and uses the address obtained by adding them as its read address or write address to read or write data. With this device, the pipeline equivalent to the one described in the prior art (FIG. 7) becomes as shown in FIG. 1. That is, the pipeline consists of three stages: ID, EX, and WB.

[0035] In the ID stage, an instruction is set in register 105. The contents (data) of the registers are read from register 110 according to the register numbers indicated by the instruction and are set in register 205.

[0036] In the EX stage, ALU 210 performs an operation according to the data set in register 205 and the contents of the instruction, and the result is set in register 305.

[0037] Meanwhile, for a load instruction or other instruction that accesses memory, the two data items set in register 205 are used as addresses, and data is read from data storage device 220 and set in register 305. For an instruction that performs neither an operation nor a memory access (such as a register-to-register copy instruction), the data in register 205 is set in register 305 unchanged.

[0038] In the WB stage, the data set in register 305 is selected by multiplexer 330 and written to register 110. Because the result of a load instruction is also set in register 305, no pipeline stall occurs even when the immediately following instruction uses it.

[0039] As an example of the operation of the pipeline shown in FIG. 1, FIG. 10 shows a timing chart. FIG. 10 shows the case where the following instruction sequence is executed:
Add r4 r5 r6 (adds the contents of register 5 and register 6 and stores the result in register 4)
Load r2 r7 r8 (adds the contents of register 7 and register 8, loads data from the memory location indicated by the resulting address, and stores it in register 2)
Add r1 r2 r3 (adds the contents of register 2 and register 3 and stores the result in register 1)
Because there is no data dependency between the first and second instructions, each of their stages is processed in one cycle. The third instruction needs the data loaded by the second instruction in its EX stage (that is, there is a data dependency), but the second instruction obtains that data at the end of its EX stage. The EX stage of the third instruction can therefore be executed at the same time as the WB stage of the second instruction, and the pipeline stall described in the prior art is unnecessary.
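The difference between the conventional and the proposed pipeline can be reduced to one small calculation: the number of stall cycles equals how many stages later than EX the load result appears. The sketch below assumes that a result can be handed to the next instruction at the end of the stage that produces it; the function name `load_use_stall` and the stage names `MEM1` and `MEM2` for a split MEM stage are hypothetical.

```python
def load_use_stall(stages, load_result_stage, operand_stage="EX"):
    """Stall cycles when the instruction right after a load needs its result.

    `stages` lists the pipeline stages in order; `load_result_stage` is the
    stage at whose end the loaded data exists; `operand_stage` is the stage
    in which the next instruction needs it.
    """
    distance = stages.index(load_result_stage) - stages.index(operand_stage)
    return max(0, distance)


conventional = ["ID", "EX", "MEM", "WB"]
split_mem = ["ID", "EX", "MEM1", "MEM2", "WB"]   # MEM divided into two stages
proposed = ["ID", "EX", "WB"]

print(load_use_stall(conventional, "MEM"))
print(load_use_stall(split_mem, "MEM2"))
print(load_use_stall(proposed, "EX"))
```

Under this model the conventional pipeline stalls for one cycle, splitting MEM adds one more, and the pipeline in which the storage device performs the address addition itself needs none.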

[0040] Depending on the instruction set, address addition may involve three addresses, that is, X, B, and a displacement (D). In this case the three addresses can be converted into two by passing them through a carry-save adder, so data storage device 220 can be used in the same way as above. A carry-save adder is only about two gate levels deep, so passing the addresses through it has little effect on delay and no new stage needs to be added to the pipeline configuration shown in FIG. 1. The conversion of three addresses into two by the carry-save adder may be performed in either the ID stage or the EX stage.
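The reduction of three addresses to two can be illustrated with a bitwise carry-save adder. This is the standard 3:2 compression written out for integers; the function name, the 32-bit width, and the sample values are assumptions made for this example.

```python
def carry_save_add(x, b, d, width=32):
    """Compress three addends into two whose sum equals x + b + d (mod 2**width)."""
    mask = (1 << width) - 1
    # Per-bit sum without carries: each bit is the XOR of the three inputs.
    partial_sum = (x ^ b ^ d) & mask
    # Per-bit carry: set where at least two inputs are 1, moved up one position.
    carry = (((x & b) | (x & d) | (b & d)) << 1) & mask
    return partial_sum, carry


index_x, base_b, displacement = 0x1234, 0x00F0, 0x0008   # sample values
first, second = carry_save_add(index_x, base_b, displacement)
print(hex(first), hex(second))
print((first + second) == (index_x + base_b + displacement))
```

Every output bit depends only on the three input bits of the same position (or the position below for the carry), so no carry ripples across the word; the two outputs can then be fed to a storage device that adds two addresses.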

[0041] Next, the second embodiment is described with reference to FIG. 2. As with the memory used in the first embodiment, this embodiment uses instruction storage device 230, which receives two addresses B and X and reads an instruction at the address obtained by adding them. With this device, the pipeline equivalent to the one described in the prior art becomes as shown in FIG. 2. That is, as in the first embodiment, the pipeline consists of three stages: ID, EX, and WB.

[0042] The following describes how a branch instruction is executed, which is the distinguishing feature of this embodiment.

[0043] In the ID stage, the contents (data) of the registers named by the register numbers in the branch instruction are read from register 110 and set in register 205.

[0044] In the EX stage, the two pieces of data set in register 205 are used as the address for reading the target instruction from memory. The target instruction is read from the instruction storage device 230 and written to register 105, where it becomes the next instruction to be executed. The number of pipeline stages needed to access the target instruction is therefore reduced by one, namely the stage formerly used for address addition.

[0045] FIG. 11 shows a timing chart as an example of the operation of the pipeline shown in FIG. 2. FIG. 11 shows the case where the following instruction sequence is executed:
Add r4 r5 r6 (the contents of register 5 and register 6 are added and the result is stored in register 4)
Branch r7 r8 (depending on a condition, the contents of register 7 and register 8 are added and control branches to the instruction at the memory location pointed to by the resulting address)
Add r1 r2 r3 (the contents of register 2 and register 3 are added and the result is stored in register 1)
The first and second instructions are processed with every stage offset by one cycle. The third instruction, however, is the instruction read out by the second instruction. Because the second instruction obtains that instruction only at the end of its EX stage, the ID stage of the third instruction cannot start until the EX stage of the second instruction has finished. Even so, the ID stage of the third instruction starts one cycle earlier than in the prior art.
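The timing relationship in this paragraph can be checked with a small sketch. The cycle numbering (the first Add enters ID in cycle 1) and the prior-art model (address addition in EX, target read one stage later) are assumptions made for illustration; FIG. 11 and FIG. 13 themselves are not reproduced here.

```python
def id_cycles(fetch_offset: int) -> list[int]:
    """Return the cycle in which each of the three instructions is in ID.

    fetch_offset is the number of stages after the branch's ID stage in which
    the target instruction is read: 1 when it is read in EX (as in FIG. 2),
    and 2 under the assumed prior-art model (add in EX, read one stage later).
    """
    add_id = 1                                 # first Add enters ID in cycle 1
    branch_id = add_id + 1                     # Branch follows one cycle later
    target_id = branch_id + fetch_offset + 1   # ID begins once the read has finished
    return [add_id, branch_id, target_id]


this_embodiment = id_cycles(fetch_offset=1)   # target read in the branch's EX stage
assumed_prior_art = id_cycles(fetch_offset=2)
```

Under these assumptions the target instruction enters ID in cycle 4 rather than cycle 5, which matches the one-cycle improvement stated in the text.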

[0046] In the first and second embodiments, FIGS. 1 and 2 show register 105. However, the register in which the instruction is set, the circuit that reads data from the registers, and the multiplexer 330 are used in these embodiments only for convenience in constructing the pipeline, and do not limit the present invention. The feature of the present invention is that there is no pipeline register between registers 205 and 305, that is, no pipeline register exists on the path from address addition to the storage device access.

[0047] The storage devices 220 and 230 used in the first and second embodiments are basically memories that, given an address, read data from or write data to the location the address points to. A storage device made up of several memories, such as the cache storage device described in the prior art, can also be built by using a storage device like 220 or 230 for each of its memories. Such a cache storage device can then be used in place of the storage devices 220 and 230 in the first and second embodiments without changing the pipeline configuration.

[0048] Another approach is to replace only some of the memories in the cache storage device with the storage device of the present invention, rather than all of them. An example is described below as a third embodiment.

[0049] The third embodiment will be described with reference to FIG. 14. Here, among the memories in the cache storage device, the directory 213 is a storage device that receives two addresses, B and X, and reads data at the address obtained by adding them. With this directory, the pipeline equivalent to that of FIG. 15 described in the prior art takes the form shown in FIG. 14. That is, as in the prior art, the pipeline consists of four stages: ID, EX, MEM, and WB.

[0050] In the ID stage, an instruction is set in register 106, and the contents (data) of the registers named by the register numbers in the instruction are read from register 110 and set in register 206.

[0051] In the EX stage, the ALU 210 performs an operation according to the data set in register 206 and the content of the instruction, and the result is set in register 306. Meanwhile, for a load instruction or other instruction that accesses memory, the two pieces of data set in register 206 are used as an address, and an entry is read from the directory 213 and set in register 306.

[0052] In the MEM stage, the addition result of the ALU 210 set in register 306 is used as the address to access the data array 311, and the data read out is input to the selection circuit 312. Meanwhile, the hit detection circuit uses the entry that was read from the directory 213 and set in register 306, among other information, to determine whether the desired data is present in the data read from the data array 311. If it is, the selection circuit 312 selects the desired data, which is set in register 406.

[0053] Then, in the WB stage, the data set in register 406 is selected by the multiplexer 410 and written to register 110.

[0054] With the configuration of FIG. 14, the read from the directory and the hit detection circuit can be placed in two separate stages without increasing the number of pipeline stages. A faster machine cycle can therefore be achieved without adding pipeline stages, which improves the performance of the computer.
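The division of work between the EX and MEM stages in the third embodiment can be sketched as follows. This is a simplified model under stated assumptions: a direct-mapped organization, 16-byte lines, and 64 directory entries are all hypothetical (the specification's selection circuit suggests a more general organization), and the function and class names are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional

LINE_BITS = 4    # assumed: 16-byte cache lines
INDEX_BITS = 6   # assumed: 64 directory entries


@dataclass
class Entry:
    valid: bool = False
    tag: int = 0


def index_of(addr: int) -> int:
    return (addr >> LINE_BITS) & ((1 << INDEX_BITS) - 1)


def tag_of(addr: int) -> int:
    return addr >> (LINE_BITS + INDEX_BITS)


def ex_stage(directory: list, b: int, x: int) -> tuple:
    """EX: the ALU adds B and X while the directory is read with the same pair."""
    address = b + x                      # ALU result, latched for the MEM stage
    entry = directory[index_of(b + x)]   # stands in for a directory that adds B and X itself
    return address, entry


def mem_stage(data_array: list, address: int, entry: Entry) -> Optional[bytes]:
    """MEM: read the data array and run hit detection on the latched entry."""
    line = data_array[index_of(address)]
    hit = entry.valid and entry.tag == tag_of(address)
    return line if hit else None
```

The point of the sketch is only that the directory entry is already available when the MEM stage begins, so hit detection no longer has to share a stage with the directory read.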

[0055] Many computers also have an address translation buffer in the cache storage device. By using, for the address translation buffer, a storage device that receives two addresses, B and X, and reads data at the address obtained by adding them, the buffer can be accessed in the EX stage.

[0056] Next, an example of how to construct the data storage device 220 and the instruction storage device 230 will be described with reference to FIG. 3.

[0057] First, the read operation will be described. When address B and address X are input as read addresses, the middle part of each is divided into a row address and a column address. Unlike the memory described in the prior art, the upper part of the divided address must be the row address and the lower part the column address. The two row addresses are each decoded by a row decoder 500, and the barrel shifter 510 outputs the signal obtained by shifting one decoder output by the value of the other row address. This output is identical to the signal that would be obtained by first adding the two row addresses and then decoding the sum with a row decoder. Among the rows of the memory array 610, this output signal selects the outputs of the 2^N memory cells in the row pointed to by the "sum of the row addresses" and of the 2^K memory cells in the one column with column address 0 in the row pointed to by the "sum of the row addresses plus 1", and these outputs are sent to the multiplexer 810.
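The equivalence between "decode, then shift" and "add, then decode" can be checked with a short sketch. It assumes that the barrel shifter rotates cyclically, so that the row sum wraps around modulo the number of rows; the function names and the row count of 8 are illustrative.

```python
def decode(row_address: int, rows: int) -> int:
    """Row decoder: return a one-hot word with only bit `row_address` set."""
    return 1 << (row_address % rows)


def barrel_rotate(one_hot: int, amount: int, rows: int) -> int:
    """Rotate a `rows`-bit word left by `amount` positions (cyclic shift assumed)."""
    amount %= rows
    mask = (1 << rows) - 1
    return ((one_hot << amount) | (one_hot >> (rows - amount))) & mask


ROWS = 8  # hypothetical number of word lines
# Shifting the decoded row address of B by the row address of X activates the
# same word line as decoding the sum of the two row addresses.
selected = barrel_rotate(decode(5, ROWS), 6, ROWS)
```

Since no adder sits in front of the decoder, the word line can be driven without waiting for a carry to propagate through the row address bits.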

[0058] Meanwhile, the carry look-ahead circuit 720 generates the carry that results from adding the lower parts of addresses B and X (the carry into the least significant bit of the column address), and this carry is input to the adder 710.
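The carry produced by the look-ahead circuit can be expressed with the usual generate/propagate recurrence. The sketch below evaluates that recurrence bit by bit for clarity, whereas a hardware look-ahead circuit would flatten it into parallel logic; the function name and the `width` parameter are illustrative assumptions.

```python
def lookahead_carry(b_low: int, x_low: int, width: int) -> int:
    """Carry out of adding the `width` low-order bits of B and X."""
    generate = b_low & x_low      # bit positions that create a carry on their own
    propagate = b_low ^ x_low     # bit positions that pass an incoming carry along
    carry = 0
    for i in range(width):
        g = (generate >> i) & 1
        p = (propagate >> i) & 1
        carry = g | (p & carry)   # c[i+1] = g[i] OR (p[i] AND c[i])
    return carry


# Example: the low 3 bits 0b110 and 0b011 sum to 0b1001, so a carry is produced.
carry_into_column = lookahead_carry(0b110, 0b011, 3)
```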

[0059] The adder 710 adds the column addresses of address B and address X together with the carry input from the carry look-ahead circuit 720. The adder 710 thus outputs the column address obtained by adding addresses B and X from the column address down through the lower part, together with a carry (the carry into the least significant bit of the row address). The carry output by the adder 710 is input to the multiplexer 810, which uses it to select, among the signals from the memory array, between the output signals of the 2^K memory cells in the first column of the 2^N cells in the row pointed to by the "sum of the row addresses" and the output signals of the 2^K memory cells in the one column with column address 0 in the row pointed to by the "sum of the row addresses plus 1". That is, the former is selected when the carry is 0 and the latter when the carry is 1. All output signals other than those of the 2^K memory cells in the first column of the row pointed to by the "sum of the row addresses" pass straight through the multiplexer 810.

[0060] The multiplexer 810 thus outputs 2^N signals. The output signal of the adder 710 is decoded by the column decoder 700, and the multiplexer 800 selects, from the 2^N output signals of the multiplexer 810, the 2^K signals of the corresponding column, which are output as the read data.

[0061] Next, the write operation will be described. It differs from the read operation in the parts relating to the multiplexers 800 and 810 and the memory array 610. The write data is input to the multiplexer 800. Of its 2^N output signals to the multiplexer 810, the multiplexer 800 drives the write data onto the 2^K signals of the one column corresponding to the column decoder output that is 1, and sets the rest to high impedance. Of the 2^N + 2^K output signals from the multiplexer 810 to the memory array 610, the write data is driven only onto the signal lines leading to the memory cells designated by the write address, and the other signal lines are set to high impedance.

[0062] The issue in the write operation is the transition time of the output signals from the multiplexer 810 to the memory array 610. This transition time arises from the delay of the column decoder and the delay of the output signals of the adder 710 and the carry look-ahead circuit 720, but it is not a problem when the column address and the lower address have few bits. When they have many bits and the transition time does become a problem, all output signals from the multiplexer 810 to the memory array 610 can be held at high impedance for roughly the duration of the delay of the adder 710 and the carry look-ahead circuit 720, in the same way as the method described in the prior art.

[0063] The example of FIG. 3 uses a barrel shifter. However, when the row decoder has many output signal lines, a barrel shifter may occupy a large area when implemented on an LSI.

[0064] One way to solve this, shown in FIG. 4, is to add the row addresses first and then decode the sum. This reduces the area, although the delay may become worse. Even in this configuration, the row address addition lets the memory array be accessed without waiting for the carry to propagate from the bits below the row address, so the delay is shorter than with a full address addition, and the transition time constraint is also removed.

[0065] The configuration of FIG. 4 can also be modified as follows. In the pipelines of FIGS. 1 and 2, addresses X and B are set in register 200. Instead, before register 200 is loaded, the adder 711 adds the row addresses (that is, performs part of the address addition), the result is set in register 200, and in the next cycle the output of register 200 is input directly to the row decoder 500.

[0066] The important point here is that accessing the storage device after partially performing the address addition in this way can be regarded as a special case of the storage device accessed with two addresses, which is the feature of the present invention: the case in which part of a value is fixed at 0 (because the value is fixed, the associated circuits, such as input signals, can be removed). In other words, beyond performing address addition and memory access in a single cycle, the present invention is characterized in that the stage boundary need not fall between address addition and memory access as in the prior art, but can be placed at whatever point partway through the address addition best suits the delay constraints.

[0067] Next, an example of how to construct the memory array 610 and the multiplexer 810 will be described with reference to FIG. 5. The memory array 610 consists of memory cells 611, each storing a 1-bit signal, and switches 612. A switch 612 turns on or off the connection between a memory cell 611 and a signal connecting the memory array 610 to the multiplexer 810 (called a "bit line" here). The switch 612 is controlled by a signal connecting the memory array 610 to the row decoder 500 (called a "word line" here): it is on when the signal is 1 and off when it is 0. The number attached to each word line is its row address, and the number attached to each memory cell is its address (here, the address is "row address × 8 + column address"). The number attached to each bit line between the multiplexers 810 and 800 is its column address. Between the memory array 610 and the multiplexer 810 there are two bit lines corresponding to column address 0, which are labeled 0-1 and 0-2 to distinguish them.

[0068] The distinctive part of this configuration of the memory array 610 and the multiplexer 810 is column address 0. For example, in a read operation, if the signal on word line 0 is 1, memory cells 0, 8, 1, 2, 3, ..., 7 are read out onto the bit lines from the memory array 610 to the multiplexer 810, in order from the left of the figure. In the multiplexer 810, memory cell 0 is output to bit line 0 when the carry is 0, and memory cell 8 is output to bit line 0 when the carry is 1. Bit lines 1 to 7 simply pass through the multiplexer 810, as shown in the figure.
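The selection performed for column address 0 can be modeled with a short sketch that follows the cell numbering "row address × 8 + column address" given for FIG. 5. The function name, the `rows` parameter, and the wrap-around from the last row to row 0 are assumptions made for illustration.

```python
COLUMNS = 8  # FIG. 5 numbers each cell as row address * 8 + column address


def cells_on_bit_lines(word_line: int, carry: int, rows: int) -> list:
    """Return the cell number seen on bit lines 0..7 after the multiplexer."""
    same_row_col0 = word_line * COLUMNS
    # Column 0 of the next row is read at the same time as the selected row
    # (wrapping from the last row back to row 0 is assumed here).
    next_row_col0 = ((word_line + 1) % rows) * COLUMNS
    bit_line_0 = next_row_col0 if carry else same_row_col0
    # Bit lines 1..7 pass through unchanged from the selected row.
    return [bit_line_0] + [word_line * COLUMNS + c for c in range(1, COLUMNS)]


without_carry = cells_on_bit_lines(word_line=0, carry=0, rows=4)
with_carry = cells_on_bit_lines(word_line=0, carry=1, rows=4)
```

With word line 0 active, bit line 0 carries cell 0 when the carry is 0 and cell 8 when the carry is 1, as in the example in the text.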

[0069] The problem here is that the line connecting memory cell 0 to bit line 0-1 becomes long, which may increase the access time. One way to solve this is described with reference to FIG. 6. FIG. 6 differs from FIG. 5 in how the rows are arranged within the memory array 610. Starting from the bottom of the figure, the row addresses are placed in every other row. At about half of the maximum row address the top of the memory array 610 is reached, so the remaining row addresses are placed in the vacant rows, working down from the top. In this way, the lines connecting the column addresses to the bit lines cross at most about one row.
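The folded row arrangement of FIG. 6 can be sketched as a mapping from row address to physical position. The sketch assumes an even number of rows, a turnaround exactly at the halfway point, and physical position 0 at the bottom of the array; the function name is illustrative.

```python
def physical_row(row_address: int, rows: int) -> int:
    """Map a row address to its physical position, counted from the bottom."""
    half = rows // 2
    if row_address < half:
        # First half: every other physical row, going up from the bottom.
        return 2 * row_address
    # Second half: fill the vacant rows, coming back down from the top.
    return 2 * (rows - 1 - row_address) + 1


ROWS = 8
layout = [physical_row(r, ROWS) for r in range(ROWS)]
# Physical distance between each row address and its successor (with wrap-around).
gaps = [abs(layout[r] - layout[(r + 1) % ROWS]) for r in range(ROWS)]
```

Under these assumptions every row address lies within two physical rows of its successor, including the wrap from the highest row address back to row 0, so a connecting line has to cross no more than one intervening row.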

[0070] The row address and the column address correspond to the upper and lower address, respectively. In the memory arrays shown in FIGS. 5 and 6, addresses are assigned in units of one bit, whereas in typical present-day computers addresses are assigned in units of bytes. Such a byte-addressed memory array can evidently be built from eight of the memory arrays shown in FIG. 5 or FIG. 6, and in that case the word lines can be shared.

[0071] The use of the memory array 610 and the multiplexer 810 shown in FIGS. 5 and 6 is not limited to the storage devices shown in FIG. 3 or FIG. 4. What characterizes the memory array 610 and the multiplexer 810 is that they reduce the carry generation delay on the path from address addition to storage device access; they are not limited to arrangements, such as those of FIGS. 3 and 4, in which no register exists on that path.

[0072] In the present invention, for the memory cells whose column address is 0, the memory cell corresponding to a row address and the memory cell corresponding to the row address obtained by adding 1 to it are each connected to a bit line, so that activation of the row address and generation of the carry of the address addition can proceed in parallel. This reduces the carry generation delay on the path from address addition to storage device access. Accordingly, besides the configuration shown in FIGS. 5 and 6, in which the memory cell for column address 0 is connected to two bit lines through switches, other configurations are conceivable, such as providing a separate memory cell for each of the two bit lines (duplication). The present invention is not limited to the configurations shown in FIGS. 5 and 6.

【００７３】[0073]

[Effects of the Invention] According to the present invention, the pipeline configuration is improved so that the result of an arithmetic instruction and the result of a load instruction become available in the same stage. Even when data read from memory by a load instruction is needed by the immediately following instruction, no pipeline stall is required. In addition, the number of cycles needed to read out the target instruction when executing a branch instruction is reduced. Together these improve the performance of the computer system.

[0074] In addition, the delay from address addition to memory access can be reduced.

[Brief description of drawings]

FIG. 1 is an explanatory diagram showing the configuration of a first embodiment of the present invention.

FIG. 2 is an explanatory diagram showing the configuration of a second embodiment of the present invention.

FIG. 3 is a block diagram showing an example of the configuration of a storage device used in an embodiment of the present invention.

FIG. 4 is a block diagram showing another example of the configuration of the storage device used in an embodiment of the present invention.

FIG. 5 is an explanatory diagram showing an example of the configuration of a memory array and a multiplexer used in an embodiment of the present invention.

FIG. 6 is an explanatory diagram showing another example of the configuration of the memory array and the multiplexer used in an embodiment of the present invention.

FIG. 7 is an explanatory diagram of a pipeline for executing arithmetic instructions and load instructions in the prior art.

FIG. 8 is an explanatory diagram of a pipeline for reading the target instruction of a branch instruction in the prior art.

FIG. 9 is a block diagram of a memory in the prior art.

FIG. 10 is a timing chart for the case where an instruction sequence including a load instruction is executed according to the present invention.

FIG. 11 is a timing chart for the case where an instruction sequence including a branch instruction is executed according to the present invention.

FIG. 12 is a timing chart for the case where an instruction sequence including a load instruction is executed according to the prior art.

FIG. 13 is a timing chart for the case where an instruction sequence including a branch instruction is executed according to the prior art.

FIG. 14 is an explanatory diagram of a third embodiment of the present invention.

FIG. 15 is an explanatory diagram of a pipeline in the case where a cache storage device according to the prior art is used.

[Explanation of symbols]

330, 410 ... multiplexer; 611 ... memory cell; 612 ... switch; 613 ... inverter.

Claims

[Claims]

1. A computer having a data storage device, the computer being a pipeline computer that, according to the result of decoding an instruction to be executed, accesses the data storage device using as an address the value obtained by adding two pieces of data, characterized in that no pipeline register exists on the path from the input of the two pieces of data output from a pipeline register, through access using the value obtained by adding them as the address, to the reading or writing of data.

2. A computer having an instruction storage device, the computer being a pipeline computer that, according to the result of decoding an instruction to be executed, accesses the instruction storage device using as an address the value obtained by adding two pieces of data, characterized in that no pipeline register exists on the path from the input of a plurality of pieces of data output from a pipeline register, through access using the value obtained by adding them as the address, to the reading or writing of an instruction.

3. The computer having a data storage device according to claim 1, wherein the pipeline computer has a register file, a first pipeline register in which an instruction to be executed is set, a second pipeline register in which data read from the register file according to the result of decoding the output data of the first pipeline register is set, and a third pipeline register in which data read from the data storage device is set, and wherein no pipeline register exists on the path from the input of two pieces of data output from the second pipeline register, through access using the value obtained by adding them as the address, to the reading or writing of data.

4. The computer having an instruction storage device according to claim 2, wherein the pipeline computer has a register file, a first pipeline register in which an instruction to be executed is set, and a second pipeline register in which data read from the register file according to the result of decoding the output data of the first pipeline register is set, an instruction read from the instruction storage device being set in the first pipeline register, and wherein no pipeline register exists on the path from the input of two pieces of data output from the second pipeline register, through access using the value obtained by adding them as the address, to the reading or writing of an instruction.

5. The computer according to claim 1, wherein the pipeline computer has: a register file; an adder; a first pipeline register in which an instruction to be executed is set; a second pipeline register in which data read from the register file according to the result of decoding the output data of the first pipeline register is set; a third pipeline register in which the output of the adder, which operates on the data set in the second pipeline register as operands, is set; and a fourth pipeline register in which data read from a data array in a data cache storage device is set; wherein the data cache storage device has: the data array, which stores data; a directory that stores entries containing address information for each piece of data stored in the data array; a hit detection circuit that detects, from the entry read from the directory, that the desired data is present in the data read from the data array; and a selection circuit that selects the desired data from the output data of the data array; and wherein no pipeline register exists on the path from the input of two pieces of data output from the second pipeline register, through access using the value obtained by adding them as the address, to the reading or writing of data.

6. The computer according to claim 2, wherein the pipeline computer has: a register file; an adder; a first pipeline register in which an instruction to be executed is set; a second pipeline register in which data read from the register file according to the result of decoding the output data of the first pipeline register is set; and a third pipeline register in which the output of the adder, which operates on the data set in the second pipeline register as operands, is set; an instruction read from an instruction cache storage device being set in the first pipeline register; wherein the instruction cache storage device has: an instruction array that stores instructions; a directory that stores entries containing address information for each instruction stored in the instruction array; a hit detection circuit that detects, from the entry read from the directory, that the desired instruction is present in the instructions read from the instruction array; and a selection circuit that selects the desired instruction from the output of the instruction array; and wherein no pipeline register exists on the path from the input of two pieces of data output from the second pipeline register, through access using the value obtained by adding them as the address, to the reading or writing of an instruction.

7. A pipeline computer that, depending on the result of decoding the instruction to be executed, accesses a data storage device using as an address the value obtained by adding three or more data items, and that uses a carry-save adder to reduce the three or more addends to two addends, the computer having a data storage device characterized in that no pipeline register is present on the path from inputting the two data items output by a register, through accessing with the value obtained by adding them as an address, to reading or writing the data.
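The reduction of three addends to two can be illustrated with a minimal sketch. It assumes a 32-bit address width and the hypothetical names `carry_save_add`, `base`, `index` and `displacement`, none of which come from the claims; it shows only the arithmetic identity, not the claimed circuit.

```python
def carry_save_add(a: int, b: int, c: int, width: int = 32) -> tuple[int, int]:
    """Reduce three addends to two without propagating carries.

    Returns (partial_sum, carry) such that
    partial_sum + carry == a + b + c (mod 2**width).
    The 32-bit default width is an assumption for illustration.
    """
    mask = (1 << width) - 1
    partial_sum = (a ^ b ^ c) & mask                      # bitwise sum, carries ignored
    carry = (((a & b) | (b & c) | (a & c)) << 1) & mask   # carries moved up one bit
    return partial_sum, carry


# Hypothetical three-component address: base + index + displacement.
base, index, displacement = 0x1000, 0x0234, 0x0010
s, c = carry_save_add(base, index, displacement)

# The two remaining addends are what an adding storage device would receive.
address = (s + c) & 0xFFFFFFFF
```

Because the carry-save stage has no carry chain, its delay does not grow with the address width; the single carry-propagating addition is left to the storage device.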

8. A pipeline computer that, depending on the result of decoding the instruction to be executed, accesses an instruction storage device using as an address the value obtained by adding three or more data items, and that uses a carry-save adder to reduce the three or more addends to two addends, the computer having an instruction storage device characterized in that no pipeline register is present on the path from inputting the two data items output by a register, through accessing with the value obtained by adding them as an address, to reading or writing the instruction.

9. A computer comprising: a memory array having a word line to which an address is input, memory cells each associated with an address and each storing 1 bit of information, switches controlled by the word line, and a first bit line and a second bit line connected to the memory cells through the switches, wherein, when an address is input on the word line, the memory cell corresponding to that address is connected to the first bit line and the memory cell corresponding to the address obtained by adding 1 to the input address is connected to the second bit line; a third bit line; and a multiplexer that connects either the first bit line or the second bit line to the third bit line.

10. The computer according to claim 1, 2, 7, 8 or 9, wherein the storage device comprises: address input signal lines to which two addresses are input from outside the storage device; a first row decoder and a second row decoder that decode the upper halves of the middle-order parts of the two addresses; a barrel shifter that shifts the output signal of the first row decoder by the output signal of the second row decoder; a memory array having word lines to which the output signal of the barrel shifter is input and first and second bit lines through which data is input and output; a carry lookahead circuit that generates the carry produced when the low-order parts of the two addresses are added; an adder that adds the output of the carry lookahead circuit and the lower halves of the middle-order parts of the two addresses; a first multiplexer that, according to the carry output of the adder, controls the connection between either the first bit line or the second bit line of the memory array and a third bit line through which data is input and output; a column decoder that decodes the output signal of the adder; a fourth bit line through which data is exchanged with the outside of the storage device; and a second multiplexer that controls the connection between the third bit line and the fourth bit line according to the output signal of the column decoder.
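A minimal sketch of why two row decoders followed by a barrel shifter can stand in for an adder followed by a single decoder. It assumes that the shift wraps around (a rotation) and that there are 8 word lines; the claim says only "shifted", and the function names are hypothetical.

```python
def decode_one_hot(value: int, lines: int) -> int:
    """Row decoder: drive exactly one of `lines` outputs, returned as a bit mask."""
    return 1 << (value % lines)


def rotate_left(bits: int, amount: int, lines: int) -> int:
    """Barrel shifter modeled as a rotation over `lines` bits (an assumption)."""
    amount %= lines
    mask = (1 << lines) - 1
    return ((bits << amount) | (bits >> (lines - amount))) & mask


def select_word_line(row_a: int, row_b: int, lines: int = 8) -> int:
    """Shift the first decoder's one-hot output by the amount the second decoder selects."""
    first = decode_one_hot(row_a, lines)
    second = decode_one_hot(row_b, lines)
    shift = second.bit_length() - 1   # position of the single active output
    return rotate_left(first, shift, lines)


# The active word line matches decoding the sum of the two row fields.
word_line = select_word_line(5, 6)    # same mask as decode_one_hot((5 + 6) % 8, 8)
```

In this model the two decoders work in parallel and the shifter only moves a single active line, so no carry has to ripple through the row field before a word line is selected.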

11. The computer according to claim 1, 2, 7, 8 or 9, wherein the storage device comprises: address input lines to which two addresses are input from outside the storage device; a first adder that adds the upper halves of the middle-order parts of the two addresses; a row decoder that decodes the output signal of the first adder; a memory array having word lines to which the output signal of the row decoder is input and first and second bit lines through which data is input and output; a carry lookahead circuit that generates the carry produced when the low-order parts of the two addresses are added; a second adder that adds the output of the carry lookahead circuit and the lower halves of the middle-order parts of the two addresses; a first multiplexer that, according to the carry output of the second adder, controls the connection between either the first bit line or the second bit line of the memory array and a third bit line through which data is input and output; a column decoder that decodes the output signal of the second adder; a fourth bit line through which data is exchanged with the outside of the storage device; and a second multiplexer that controls the connection between the third bit line and the fourth bit line according to the output signal of the column decoder.
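The data path of this claim can be modeled in a few lines. The sketch below assumes a 7-bit address split into a 3-bit row field, a 2-bit column field and a 2-bit low-order field, and assumes the second bit lines carry the whole of the next row; these widths and all names are illustrative and do not come from the claims.

```python
ROW_BITS, COL_BITS, LOW_BITS = 3, 2, 2   # assumed field widths


def split(address: int) -> tuple[int, int, int]:
    """Split an address into (row, column, low-order) fields."""
    low = address & ((1 << LOW_BITS) - 1)
    col = (address >> LOW_BITS) & ((1 << COL_BITS) - 1)
    row = (address >> (LOW_BITS + COL_BITS)) & ((1 << ROW_BITS) - 1)
    return row, col, low


def read_sum_addressed(memory: list[list[int]], a: int, b: int) -> int:
    """Read the word at address a + b without forming the full sum first."""
    row_a, col_a, low_a = split(a)
    row_b, col_b, low_b = split(b)

    # Carry lookahead circuit: only the carry out of the low-order fields.
    low_carry = (low_a + low_b) >> LOW_BITS

    # Second adder: column fields plus that carry.
    col_sum = col_a + col_b + low_carry
    column = col_sum & ((1 << COL_BITS) - 1)
    col_carry = col_sum >> COL_BITS

    # First adder: row fields only, with no carry input to wait for.
    row = (row_a + row_b) % (1 << ROW_BITS)

    # The array drives row `row` on the first bit lines and row `row + 1` on the second.
    first_bit_lines = memory[row]
    second_bit_lines = memory[(row + 1) % (1 << ROW_BITS)]

    # First multiplexer: the column carry chooses between the two rows.
    third_bit_lines = second_bit_lines if col_carry else first_bit_lines

    # Column decoder and second multiplexer: pick one word.
    return third_bit_lines[column]


# Each word stores its own word index so results are easy to check.
memory = [[r * (1 << COL_BITS) + c for c in range(1 << COL_BITS)]
          for r in range(1 << ROW_BITS)]
value = read_sum_addressed(memory, 0x2C, 0x17)
```

In this model the row access starts from a sum that ignores the lower fields, and the late-arriving carry is applied only at the multiplexer, after both candidate rows are already on the bit lines.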

12. A computer comprising: a memory array having word lines to which a decoded row-address signal is input, a plurality of memory cells each storing a 1-bit signal, first bit lines through which data is input and output, and switches that, under control of the word lines, control the connection between the memory cells and the input/output signals, wherein a plurality of the memory cells are grouped to form a row memory cell, each row memory cell is associated with a row address, each memory cell within a row memory cell is associated with a column address, the first bit lines are equal in number to the memory cells in one row memory cell and each is connected through the switches to the memory cells of all row memory cells that correspond to the same column address, activating one of the word lines turns on the switches connected to all memory cells of the one row memory cell whose address corresponds to that word line so that each of those memory cells is connected to its first bit line, and one second bit line is provided such that, in the row memory cell corresponding to the address obtained by adding 1 to the address of the activated word line, the switch connected to the memory cell whose column address is 0 is turned on and that memory cell is connected to the second bit line; third bit lines; and a multiplexer that connects either the first bit line corresponding to column address 0 or the second bit line to a third bit line, and passes through the first bit lines corresponding to non-zero column addresses.

13. The computer according to claim 12, having a memory array wherein: the memory array, which stores a large number of bits, is composed of a large number of row memory cells; in each row memory cell, the memory cells storing a 1-bit signal are arranged on a straight line; the memory array is formed by placing one row memory cell on each of a group of straight lines that are equal in number to the row memory cells and are arranged in parallel on a plane; the row memory cells are placed, in the order of their row addresses, on every other straight line in a fixed direction; when no straight line remains in the fixed direction, placement continues in the direction rotated 180 degrees from the fixed direction, that is, the reverse direction, on the straight lines on which no row memory cell has yet been placed; and when no straight line remains in the reverse direction, placement continues in the direction rotated 180 degrees from the reverse direction, that is, the original fixed direction, on the straight lines on which no row memory cell has yet been placed; whereby at most one row memory cell is placed between the row memory cell corresponding to the minimum row address and the row memory cell corresponding to the maximum row address, and at most one row memory cell is placed between the two row memory cells corresponding to any two consecutive row addresses.
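The placement rule can be sketched as follows. This is an illustrative model, not the claimed layout itself: it assumes physical lines numbered from 0, and it uses a single turn-around, which already fills every line when every other line is skipped. The name `fold_rows` is hypothetical.

```python
def fold_rows(num_rows: int) -> list[int]:
    """Return placement[line] = row address placed on that physical line.

    Rows are placed in address order on every other line going one way,
    then on the lines still free going back the other way.
    """
    forward = list(range(0, num_rows, 2))                    # lines 0, 2, 4, ...
    backward = sorted(range(1, num_rows, 2), reverse=True)   # free lines, far end first
    line_of_row = forward + backward                         # line_of_row[row] = line

    placement = [0] * num_rows
    for row, line in enumerate(line_of_row):
        placement[line] = row
    return placement


layout = fold_rows(8)   # -> [0, 7, 1, 6, 2, 5, 3, 4]

# Consecutive row addresses, and the wrap from the last row back to row 0,
# are never more than two lines apart.
line_of = {row: line for line, row in enumerate(layout)}
max_gap = max(abs(line_of[r] - line_of[(r + 1) % 8]) for r in range(8))
```

With this folding, the row for address n + 1 is physically next to, or one line away from, the row for address n, including the wrap from the highest address to the lowest, which keeps any wiring that links a row to its successor short.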

14. A computer having a memory array configured by using a plurality of the memory arrays according to claim 12 or 13.