JP2004054419A

JP2004054419A - Inter-node transaction processor

Info

Publication number: JP2004054419A
Application number: JP2002208475A
Authority: JP
Inventors: Hiroyuki Yuri; 由利　裕行
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2002-07-17
Filing date: 2002-07-17
Publication date: 2004-02-19

Abstract

PROBLEM TO BE SOLVED: To transmit transactions with data between nodes without any distinction in priority relative to transactions without data, while suppressing the increase in latency when transmitting transactions with data.

SOLUTION: A detection circuit 117 detects the type of each transaction to be transmitted between the nodes 101 and 102. A transaction without a data cycle is transmitted using one channel 115. For a transaction with data cycles, the data is divided into a plurality of sub-data packets, and the sub-data packets are transmitted using the plurality of channels 115.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明はコンピュータシステムに関し、特に、処理効率を考慮したノード間トランザクション処理に関する。
【０００２】
【従来の技術】
近年、主にマルチメディアデータ転送を目的としたインタフェースとして、高速データ転送、リアルタイム転送を実現するシリアルバスが規格化されている。通常これはネットワーク通信分野で広く利用されているが、コンピュータシステムの分野においてもプロセッサバスを内在する複数のノード間を結合する手段としてシリアルバスが利用可能である。特に、ラックマウントを目的とする薄型サーバでは前記プロセッサバスを含むノード単位に基板が構成され、前記ノード間をコネクタで接続して高Ｗａｙ数サーバを実現する上でシリアルバスは有効な手段である。
【０００３】
しかしノード内のプロセッサバスの高速性と比較するとシリアルバスの転送速度は低く、ノード間を渡る通信量が増加することによりシステム全体のスループットを落としたり、或いは転送データのサイクル数そのものが長いことによりレイテンシを増大させるなどの性能低下を引き起こす。
【０００４】
特開平１１−１７７６７号公報に記載されたシリアルインタフェース回路では、シリアルバスを介して転送するパケットを第１種パケット、第２種パケットと分類し、データ等容量の大きいものを第１種、コマンドを第２種として第２種に分類されるパケットを優先的に転送することでコマンド転送のレイテンシ増大をできるだけ抑制するノード間の転送を実現している。
【０００５】
【発明が解決しようとする課題】
プロセッサバスを内在する複数ノード間において、一般にノード間の通信は自ノードに存在しないメモリの内容をリード又はライトすることを目的に発行される。このとき発行された前記リード又はライトのリクエストが完了し、対象となるメモリからの必要データのリードが完了する或いは前記メモリへのデータライトが完了することにより、次のリクエストが発行可能となるケースも多い。
【０００６】
このようなシステムでは上記手法のようにリクエスト（コマンド）だけを優先的に送信しても結局データの送信を待たなければならずシステム全体の性能を上げるのは難しい。
【０００７】
本発明は、データを伴うリクエストトランザクション又はレスポンストランザクションのノード間の送信に際して、データの伴わないリクエストトランザクション又はレスポンストランザクションに対して優先度の区別なく送信させるとともに、データを伴うトランザクション送信におけるレイテンシ増大を抑制し、システム全体の性能を向上するトランザクション処理装置を提供することにある。
【０００８】
【課題を解決するための手段】
上記目的を達成するため、本発明のトランザクション処理装置は、ノード間の通信を要するトランザクションがリードリクエストトランザクション、ライトリクエストトランザクション、リード完了通知トランザクション、ライト完了通知トランザクションのいずれであるかを検知する手段と、検知手段による検知結果に基づき、前記トランザクションがリードリクエストトランザクションないしライト完了通知トランザクションの場合に分割によるオーバーヘッド削減のため１本の通信路により送信する手段と、前記トランザクションがライトリクエストトランザクションないしリード完了通知トランザクションの場合に前記転送するトランザクションに含まれるデータをサブデータパケットに分割し、前記サブデータパケットを各々対応する複数の通信路により送信することによってトランザクションの通信に係るレイテンシを削減する手段と、を具備したことを特徴とする。
【０００９】
【発明の実施の形態】
以下、本発明の実施形態に係るトランザクション処理装置について、図面を参照しながら詳細に説明する。図１は本発明の実施形態に係るノード間トランザクション処理装置の２ノードから成るシステムの構成例を示す図であり、図２は本発明の実施形態に係るノード間トランザクション処理装置の４ノードから成るシステムの構成例を示す図である。また、図３は本実施形態に係るノード間トランザクション処理を１本の伝送路で実現した装置の動作例を示すタイムチャートであり、図４は本実施形態に係るノード間トランザクション処理を２本の伝送路で実現した装置の動作例を示すタイムチャートである。
【００１０】
図１における２個のノード１０１，１０２から成るコンピュータシステムの例において、各ノードは同一のプロセッサバス１０３に接続される複数個のプロセッサ１０４と、メモリ１０５と、前記プロセッサないし前記メモリ等を制御するコントローラ１０６と、複数個（本例では２個）の通信インタフェース［１］１０７及び通信インタフェース［２］１０８を有し、前記ノードは同じく同一プロセッサバス１０９に接続されるプロセッサ１１０、メモリ１１１、コントローラ１１２、通信インタフェース［１］１１３、通信インタフェース［２］１１４を有する他のノードと複数本（本例では２対４本）の通信路１１５によって結合される。
【００１１】
１本の通信路は１ビットないしｎビットの幅を持つシリアルバスで構成する。前記システムにおいて２個のノード１０１，１０２は同一の構成であり、いずれのプロセッサ１０４，１１０からもリクエストトランザクションの発行が可能であるが、説明を容易にするため、ここではノード１０１のプロセッサ１０４からリクエストトランザクションが発行されるケースについて説明する。
【００１２】
ここで、本実施形態において用いられるノード間トランザクションについて説明する。ノード間トランザクションの種類としては、リクエスト（要求）トランザクションとレスポンス（応答）トランザクションがあり、リクエストトランザクションにはリードトランザクションとライトとライトトランザクションがあって、前記ライトトランザクションにはデータが付加されている。また、レスポンストランザクションにはリード完了通知トランザクションとライト完了通知トランザクションがあって、前記リード完了通知トランザクションにはデータが付加されている。すなわち、ノード間トランザクションの内で、ライトリクエストトランザクションとリード完了通知トランザクションには、データが付加されている。
【００１３】
本発明の実施形態の特徴の１つは、その詳細は後述するが、このデータの付加されたトランザクションに対して当該データをサブデータパケットに分割してそれぞれのサブデータパケットを複数のノード間通信路に分けて送信することにより（データ付加されていないトランザクションは１本の通信路で送信する）、トランザクション通信のレイテンシを削減することである。
【００１４】
再び、図１において、プロセッサ１０４から発行されたリードないしライトのリクエストトランザクションは前記トランザクションの宛先メモリアドレスによって、前記宛先メモリアドレスが前記プロセッサと同一ノード１０１内のメモリ１０５に含まれる場合、前記メモリに対し直接リクエストを渡し、前記メモリアドレスが前記プロセッサとは異なるノード１０２のメモリ１１１に含まれる場合、ノード間を接続する通信路１１５によってリクエストトランザクションを送信して宛先ノード１０２のメモリ１１１にリクエストを渡す。
【００１５】
一方、前記リクエストトランザクションを受取ったメモリ１１１は、前記リクエストトランザクションがリードリクエストの場合、前記メモリアドレスで指定されるメモリの情報を読み出し、リード完了通知トランザクションと共にリードデータを通信路１１５によって前記リクエストトランザクションを発行したプロセッサ１０４に対して送信する。また、前記リクエストトランザクションがライトリクエストの場合、前記メモリアドレスで指定されるメモリに前記プロセッサから送信されたライトデータを書き込み、ライト完了通知トランザクションを通信路１１５によって前記リクエストを発行したプロセッサ１０４に対して送信する。
【００１６】
次に、前記通信路１１５を用いた本実施形態に関する送信手法について詳細に説明する。通信路１１５を用いたノード間通信の制御は各々のノード１０１，１０２が有するコントローラ１０６，１１２によって実現する。ここでも前記コントローラはいずれの前記ノードでも同一構成であるが、説明を容易にするためリクエスト送信ノード１０１のコントローラ１０６については送信制御を実現するための回路のみ図示し、リクエスト受信ノード１０２のコントローラ１１２については受信制御を実現するための回路のみ図示している。ここで、コントローラやプロセッサを含めたノード１０１，１０２の回路構成を半導体パッケージとして一体的に形成しても良い。
【００１７】
まず、トランザクションの送信制御を実現する回路について説明する。ノード１０１のプロセッサ１０４が発行するノード１０２のメモリ１１１宛のリードないしライトリクエストトランザクションは、まずコントローラ１０６内の送信バッファ１１６に記憶される。ノード１０１内で先行するノード間トランザクション通信が終了すると、次にノード間トランザクション通信を行う対象のトランザクションを前記送信バッファより読み出す。検知回路１１７は読み出したトランザクションの種類が、リードリクエストであるかライトリクエストであるかを検知し、前記トランザクションがリードリクエストの場合、データが付加されていないのでトランザクション分割部１１８に対し「分割無」の通知を行い、また、前記トランザクションがライトリクエストの場合、データが付加されているので（付加されたデータをサブデータパケットに分割する必要があるので）前記トランザクション分割部１１８に対し「分割有」の通知を行う。
【００１８】
前記トランザクション分割部１１８では前記検知回路１１７より「分割無」の通知を受取ると、送信バッファ１１６より読み出したリードトランザクションのメモリアドレスに従いアドレスインタリーブでノード間転送を行う通信路１１５を１本決定し、前記通信路に対応する通信インタフェース［１］１０７又は通信インタフェース［２］１０８を用いて前記トランザクションを送信する。すなわち、リードリクエストはデータの付加されていないトランザクションであるので、１本の通信路で送信して他の通信路は空けておく。
【００１９】
一方、前記トランザクション分割部１１８で前記検知回路１１７より「分割有」の通知を受取ると、前記トランクション分割部１１８は次に分割数判定回路１１９からの分割数の指定を待つ。前記分割数判定回路１１９では、ノード間の通信路数Ｋ（図１に示す例では送信用には２本）と送信バッファより読み出したライトトランザクションに含まれるデータサイクル数Ｍによってデータの分割数Ｎを次の通り決定する。Ｋ＞Ｍの場合分割数Ｎ＝Ｍ、Ｋ≦Ｍの場合分割数Ｎ＝Ｋとする。具体例で説明すると、通信路数Ｋが３本で、データがデータ量の少ない２サイクルとデータ量の多い１０サイクルのトランザクションがある場合に、前者のデータ分割数Ｎは２であり、後者のデータ分割数Ｎは３である。トランザクション分割部１１８は分割数判定回路１１９から分割数の指定を受取ると、送信バッファ１１６より読み出したライトトランザクションに含まれるデータを以下の要領でサブデータパケットに分割する。
【００２０】
ここで、図３を参照して、ノードのシステム内送信とノード間の伝送路送信についての情報転送を具体例で説明する。規格にしたがって例えば、システム内では１サイクルで１６バイト幅を処理でき、伝送路では４バイト幅を１サイクルとして処理できる場合、図３に示すように伝送路送信ではシステム内のヘッダ情報もデータも４つに分ける必要がある。図３に示すように、システム内で１サイクルのヘッダ情報３０１と８サイクルのデータ３０２からなるトランザクション３０３は、伝送路送信においては、１サイクルの処理バイト幅が１／４となっているので、トランザクション３０３は３６（９サイクル×４）サイクルとなっている。
【００２１】
次に、上述したサブデータパケットの具体的な分割手法を説明する。すなわち、具体例を以って云えば、ノード間の通信インタフェースが［１］、［２］、［３］と３本であって、８サイクルのデータであった場合に、各インタフェースに８サイクルデータを如何に振り分けるかがサブデータパケットの分割ということである。結論的に云えば、図３を参照すると、インタフェース［１］にはＤ１、Ｄ４、Ｄ７を、インタフェース［２］にはＤ２、Ｄ５、Ｄ８を、インタフェース［３］にはＤ３、Ｄ６を分割配分し、それぞれのサブデータパケットの先頭にヘッダ情報Ｈを付けることになる。このサブデータパケットのヘッダ情報はそれぞれが同一の内容を持っていて、システム内でのヘッダ情報と１サイクル内の処理バイト幅は異なるがパケット統合後の内容は同一のものである。
【００２２】
以上のサブデータパケットの分割を一般的概念で記述すると、以下の通りとなる。前記ライトトランザクションはメモリアドレスを含む１サイクルのリクエストヘッダ情報とＭサイクルのデータから成るものとする。このとき前記リクエストヘッダサイクルＨは分割するＮ個全てのサブデータパケットの先頭に置き、前記ＭサイクルのデータＤ（１）、Ｄ（２）、…、Ｄ（Ｍ）は、サブデータパケット１に対しｄ（ｙ）＝Ｄ（ｘＮ＋１）、サブデータパケット２に対しｄ（ｙ）＝Ｄ（ｘＮ＋２）、…、サブデータパケットＮに対しｄ（ｙ）＝Ｄ（ｘＮ＋Ｎ）を置くよう分割を行う（ここでｘは０以上の整数、ｙは各々のサブデータパケットのデータに順に付した番号とする）。以上のようにして分割を行ったサブデータパケット１からサブデータパケットＮを各々対応する通信インタフェース［１］から通信インタフェース［Ｎ］を用いて送信する。
【００２３】
次に、トランザクションの受信制御を実現する回路について説明する。受信ノード１０２では通信路１１５により送信されたトランザクションないしサブデータパケットは、送信ノード１０１の通信インタフェース［１］１０７、通信インタフェース［２］１０８に各々対応して通信インタフェース［１］１１３、通信インタフェース［２］１１４にて受信する。受信したトランザクションないしサブデータパケットは、まずヘッダ検出部１２０においてリードリクエストトランザクションであるかライトリクエストトランザクションのサブパケットであるかを判断する。前記判断結果がリードリクエストトランザクションの場合前記トランザクションはそのままトランザクション統合部１２１を通過して受信バッファ１２２に記憶される。前記受信バッファに記憶されたリクエストトランザクションは順次宛先メモリ１１１に渡される。
【００２４】
一方、前記ヘッダ検出部１２０においてライトリクエストトランザクションのサブパケットであると判断した場合で、比較回路１２３にはまだヘッダサイクルが記憶されていない場合、前記サブパケットは前記ライトリクエストトランザクションを分割した１個目のサブパケットであると判断し、前記比較回路１２３にヘッダサイクルを記憶する。また、トランザクション統合部１２１に前記受信サブパケットのヘッダサイクルを通知すると共に、前記ヘッダサイクルから元のライトトランザクションのデータサイクル数Ｍを検知し、前記トランザクション統合部から受信バッファ１２２のＭ個分データ格納領域の確保を行う。次に、前記ヘッダサイクルに続くデータｄ（ｙ）はトランザクション統合部１２１において、受信した通信インタフェース［１］から通信インタフェース［Ｎ］の番号［ｉ］に対応してデータＤ（（ｙ−１）Ｎ＋ｉ）を復元し、前記受信バッファの対応する領域に記憶する。前述の例で具体的に云えば、通信インタフェース［１］に対応してＤ１、Ｄ４、Ｄ７を復元し、このデータＤ１、Ｄ４、Ｄ７は受信バッファの８個分確保した領域の第１、第４、第７の領域に記憶されることとなる。
【００２５】
さらに、前記ヘッダ検出部１２０においてライトリクエストトランザクションのサブパケットであると判断した場合で、比較回路１２３にすでにヘッダサイクルが記憶されている場合、受信したヘッダサイクルと前記記憶されているヘッダサイクルとの比較を行う。比較の結果同一ヘッダサイクルではないと判断した場合は、新たに１個目のサブパケットを受信したと判断して上記と同じ処理を行う。また比較の結果同一ヘッダサイクルであると判断した場合トランザクション統合部１２１にはヘッダサイクルの通知を行わず、前記比較回路１２３より前記トランザクション統合部１２１に対し統合指示を通知することによってヘッダサイクルに続くデータｄ（ｙ）に対し上記と同様にデータＤ（（ｙ−１）Ｎ＋ｉ）を復元し、受信バッファ１２２の対応する領域に記憶する。
【００２６】
トランザクション統合部１２１ではまた、最初のヘッダサイクルの通知時にこれから統合を行おうとするトランザクションの分割数を検知し、比較回路１２３から通知される統合指示が前記分割数に達したことによって全分割サブデータパケットを受信したと判断し、受信バッファ１２２へのデータの記憶を完了する。前記受信バッファに記憶されたリクエストトランザクションは順次宛先メモリ１１１に渡される。
【００２７】
図２は４個のノード２０１〜２０４から成るコンピュータシステムの例を示す。各ノードの構成は図１に示すものと同一であり、図２の例は各ノードが通信インタフェース［１］から通信インタフェース［４］２０５〜２０８を有し、各々他のノードと中継装置［１］から中継装置［４］２０９〜２１２を介する４対の通信路２１３によって結合される。なお、図２では送信と受信の一対の通信路を一本の実線で示している。このようにシステムを構成するノード数を４ノード、８ノード、…、ｎノードと増やした場合でも、中継装置を介して或いは多段構成される中継装置を介してノード間の複数の通信路による結合を持つことにより、図１にて説明したトランザクション処理装置の実現が可能である。
【００２８】
以上説明したように、本発明の実施形態では、１回のノード間トランザクション転送に係るレイテンシを削減する効果が得られる。この効果について、図３及び図４を用いて説明する。
【００２９】
図３はヘッダ情報３０１に続く８サイクルのデータ３０２から成るトランザクション３０３を１個の伝送路を用いて送信するケースを示しており、伝送路のデータ幅（規格で例えば、１サイクルで４バイト）がシステム内（規格で例えば、１サイクルで１６バイト）に比べ１／４である例を挙げている。前記ケースでは伝送路に対しデータを送信するためのデータ送信時間３０４に３６サイクルを要し、次に前記伝送路に対し送信開始するまでの間の送信完了待ち遅延３０５は２７サイクルを要する（次回のトランザクションにとってのレイテンシが２７サイクル増大する）ことが判る。
【００３０】
これに対して、図４は図３と同じ能力の伝送路２本を用いてトランザクションを送信するケースを示しており、前記ケースでは伝送路に対するデータ送信時間４０１は２０サイクル、送信完了待ち遅延４０２は１１サイクルに改善されていることが判る。
【００３１】
【発明の効果】
本発明によれば、ノード間の通信を要するトランザクションに含まれるデータをサブデータパケットに分割して複数の通信路で送信することによって、トランザクションの通信に係るレイテンシを削減することができる。
【００３２】
また、ノード間で送信するトランザクションの種類を検知することによりノード間のデータ無しトランザクション転送のオーバーヘッドは増加させずに、データトランザクション転送のレイテンシを削減するノード間トランザクション処理装置の実現が可能である。
【図面の簡単な説明】
【図１】本発明の実施形態におけるノード間トランザクション処理装置の２ノードから成るシステムの構成例を示す図である。
【図２】本発明の実施形態に係るノード間トランザクション処理装置の４ノードから成るシステムの構成例を示す図である。
【図３】本実施形態に係るノード間トランザクション処理を１本の伝送路で実現した装置の動作例を示すタイムチャートである。
【図４】本実施形態に係るノード間トランザクション処理を２本の伝送路で実現した装置の動作例を示すタイムチャートである。
【符号の説明】
１０１，１０２　ノード
１０３，１０９　プロセッサバス
１０４，１１０　プロセッサ
１０５，１１１　メモリ
１０６，１１２　コントローラ
１０７，１０８，１１３，１１４　通信インタフェース
１１５　通信路
１１６　送信バッファ
１１７　トランザクション種類検知回路
１１８　トランザクション分割部
１１９　データ分割数判定回路
１２０　サブデータパケットヘッダ検出部
１２１　トランザクション結合部
１２２　受信バッファ
１２３　ヘッダサイクル比較回路
２０１〜２０４　ノード
２０５〜２０８　通信インタフェース
２０９〜２１２　中継装置
２１３　通信路
３０１　ヘッダ情報
３０２　データ
３０３　トランザクション
３０４，４０１　データ送信時間
３０５，４０２　送信完了待ち遅延
[0001]
[Technical Field of the Invention]
The present invention relates to a computer system, and more particularly, to a transaction process between nodes in consideration of processing efficiency.
[0002]
[Prior art]
In recent years, serial buses that realize high-speed, real-time data transfer have been standardized as interfaces intended mainly for multimedia data transfer. Such buses are widely used in network communication, but in computer systems a serial bus can also be used to connect a plurality of nodes, each containing a processor bus. In particular, in thin servers intended for rack mounting, a board is built for each node containing the processor bus, and a serial bus is an effective means of realizing a server with a high way count by connecting the nodes through connectors.
[0003]
However, the transfer speed of a serial bus is low compared with the high speed of the processor bus inside a node. This causes performance degradation: the throughput of the entire system drops as the amount of communication crossing between nodes increases, or latency increases because the number of cycles needed to transfer the data is itself long.
[0004]
In the serial interface circuit described in Japanese Patent Application Laid-Open No. H11-17767, packets transferred via a serial bus are classified into first-type packets and second-type packets. Large items such as data are the first type and commands are the second type, and packets classified into the second type are transferred preferentially, thereby realizing inter-node transfer that suppresses the increase in command transfer latency as much as possible.
[0005]
[Problems to be solved by the invention]
Among a plurality of nodes each containing a processor bus, inter-node communication is generally issued in order to read or write the contents of a memory that does not exist in the issuing node. In many cases, the next request can be issued only after the issued read or write request has completed, that is, after the required data has been read from the target memory or the data has been written to that memory.
[0006]
In such a system, even if only a request (command) is preferentially transmitted as in the above-described method, it is necessary to wait for data transmission after all, and it is difficult to improve the performance of the entire system.
[0007]
An object of the present invention is to provide a transaction processing device that, when transmitting a request transaction or response transaction with data between nodes, transmits it without any distinction in priority relative to request transactions or response transactions without data, suppresses the increase in latency when transmitting transactions with data, and improves the performance of the entire system.
[0008]
[Means for Solving the Problems]
In order to achieve the above object, a transaction processing device according to the present invention comprises: means for detecting whether a transaction requiring communication between nodes is a read request transaction, a write request transaction, a read completion notification transaction, or a write completion notification transaction; means for transmitting the transaction over a single communication path, based on the result of detection by the detecting means, when the transaction is a read request transaction or a write completion notification transaction, so as to reduce the overhead caused by division; and means for, when the transaction is a write request transaction or a read completion notification transaction, dividing the data included in the transaction to be transferred into sub-data packets and transmitting the sub-data packets over a plurality of respectively corresponding communication paths, thereby reducing the latency of the transaction communication.
[0009]
[Embodiments of the Invention]
Hereinafter, a transaction processing device according to an embodiment of the present invention will be described in detail with reference to the drawings. FIG. 1 is a diagram showing a configuration example of a two-node system of the inter-node transaction processing device according to an embodiment of the present invention, and FIG. 2 is a diagram showing a configuration example of a four-node system of the same device. FIG. 3 is a time chart showing an operation example of an apparatus in which the inter-node transaction processing according to the present embodiment is realized with one transmission path, and FIG. 4 is a time chart showing an operation example of an apparatus in which it is realized with two transmission paths.
[0010]
In the example of FIG. 1, a computer system including two nodes 101 and 102, each node has a plurality of processors 104 connected to the same processor bus 103, a memory 105, a controller 106 that controls the processors, the memory, and so on, and a plurality of (two in this example) communication interfaces, namely communication interface [1] 107 and communication interface [2] 108. The node is coupled by a plurality of communication paths 115 (two pairs, that is, four paths in this example) to another node that likewise has processors 110 connected to the same processor bus 109, a memory 111, a controller 112, communication interface [1] 113, and communication interface [2] 114.
[0011]
One communication path is constituted by a serial bus having a width of 1 bit to n bits. In this system the two nodes 101 and 102 have the same configuration, and a request transaction can be issued from either of the processors 104 and 110. For ease of explanation, however, the case where a request transaction is issued from the processor 104 of the node 101 is described here.
[0012]
Here, the inter-node transactions used in the present embodiment will be described. The types of inter-node transaction are request transactions and response transactions. Request transactions comprise read transactions and write transactions, and data is attached to a write transaction. Response transactions comprise read completion notification transactions and write completion notification transactions, and data is attached to a read completion notification transaction. That is, among the inter-node transactions, data is attached to the write request transaction and the read completion notification transaction.
[0013]
One feature of the embodiment of the present invention, described in detail later, is that for a transaction to which data is attached, the data is divided into sub-data packets and the sub-data packets are transmitted separately over a plurality of inter-node communication paths (a transaction to which no data is attached is transmitted over a single communication path), thereby reducing the latency of transaction communication.
[0014]
Referring again to FIG. 1, a read or write request transaction issued from the processor 104 is handled according to its destination memory address. When the destination memory address is included in the memory 105 in the same node 101 as the processor, the request is passed directly to that memory. When the memory address is included in the memory 111 of the node 102, which is a different node from that of the processor, the request transaction is transmitted over the communication path 115 connecting the nodes and is passed to the memory 111 of the destination node 102.
[0015]
Meanwhile, when the request transaction is a read request, the memory 111 that has received it reads the information in the memory location specified by the memory address and transmits the read data, together with a read completion notification transaction, over the communication path 115 to the processor 104 that issued the request transaction. When the request transaction is a write request, the write data transmitted from the processor is written to the memory location specified by the memory address, and a write completion notification transaction is transmitted over the communication path 115 to the processor 104 that issued the request.
[0016]
Next, the transmission method of the present embodiment using the communication path 115 will be described in detail. Control of inter-node communication using the communication path 115 is realized by the controllers 106 and 112 of the nodes 101 and 102, respectively. The controllers also have the same configuration in every node. For ease of explanation, however, only the circuits that implement transmission control are shown for the controller 106 of the request-transmitting node 101, and only the circuits that implement reception control are shown for the controller 112 of the request-receiving node 102. The circuit configuration of the nodes 101 and 102, including the controller and the processors, may be formed integrally as a semiconductor package.
[0017]
First, the circuit that implements transaction transmission control will be described. A read or write request transaction issued by the processor 104 of the node 101 and addressed to the memory 111 of the node 102 is first stored in the transmission buffer 116 in the controller 106. When the preceding inter-node transaction communication in the node 101 has finished, the next transaction to be communicated between nodes is read from the transmission buffer. The detection circuit 117 detects whether the transaction that was read is a read request or a write request. If the transaction is a read request, no data is attached, so the detection circuit notifies the transaction division unit 118 of "no division". If the transaction is a write request, data is attached (and the attached data must be divided into sub-data packets), so the detection circuit notifies the transaction division unit 118 of "division required".
[0018]
When the transaction division unit 118 receives the notification of "no division" from the detection circuit 117, it determines one communication path 115 for performing inter-node transfer by address interleaving according to the memory address of the read transaction read from the transmission buffer 116, The transaction is transmitted using the communication interface [1] 107 or the communication interface [2] 108 corresponding to the communication path. That is, since the read request is a transaction to which no data is added, the read request is transmitted through one communication path and the other communication paths are kept free.
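The type detection and single-path selection described in paragraphs [0017] and [0018] can be sketched as follows. This is only an illustrative sketch: the names `Kind`, `needs_division`, and `select_path` are hypothetical, and the interleaving rule (a fixed granularity `line_bytes` with a modulo mapping) is an assumption, since the description says only that the path is chosen by address interleaving.

```python
from enum import Enum


class Kind(Enum):
    READ_REQUEST = "read_request"
    WRITE_REQUEST = "write_request"
    READ_COMPLETION = "read_completion"
    WRITE_COMPLETION = "write_completion"


# Per the description, only these two kinds carry a data payload.
CARRIES_DATA = {Kind.WRITE_REQUEST, Kind.READ_COMPLETION}


def needs_division(kind: Kind) -> bool:
    """True when the detection circuit would signal that division is required."""
    return kind in CARRIES_DATA


def select_path(address: int, num_paths: int, line_bytes: int = 64) -> int:
    """Pick one 0-based path index for a transaction without data.

    ASSUMPTION: interleave on address blocks of line_bytes bytes, modulo
    the number of paths. The source does not specify the exact mapping.
    """
    return (address // line_bytes) % num_paths
```

With two paths, consecutive 64-byte blocks alternate between path 0 and path 1 under this assumed mapping, which spreads transactions without data across the paths without dividing them.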
[0019]
On the other hand, when the transaction division unit 118 receives a notification of "division required" from the detection circuit 117, it then waits for the division number determination circuit 119 to specify the number of divisions. The division number determination circuit 119 determines the number of data divisions N from the number of communication paths K between the nodes (two for transmission in the example shown in FIG. 1) and the number of data cycles M included in the write transaction read from the transmission buffer, as follows: if K > M, then N = M; if K ≤ M, then N = K (that is, N is the smaller of K and M). As a specific example, when the number of communication paths K is three and there are a transaction with a small amount of data (2 cycles) and a transaction with a large amount of data (10 cycles), the data division number N is 2 for the former and 3 for the latter. On receiving the number of divisions from the division number determination circuit 119, the transaction division unit 118 divides the data included in the write transaction read from the transmission buffer 116 into sub-data packets in the following manner.
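The rule for the number of divisions can be written as a small function. The function name `division_count` is hypothetical. The rule itself (N = M when K > M, otherwise N = K) is taken from the paragraph above.

```python
def division_count(k: int, m: int) -> int:
    """Number of sub-data packets N for K paths and M data cycles."""
    if k < 1 or m < 1:
        raise ValueError("k and m must be positive")
    # K > M: fewer data cycles than paths, so one cycle per packet.
    # K <= M: use every available path.
    return m if k > m else k
```

For the example in the text, `division_count(3, 2)` gives 2 and `division_count(3, 10)` gives 3.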
[0020]
Here, with reference to FIG. 3, the transfer of information in transmission inside a node's system and in transmission over the transmission path between nodes is described using a specific example. Suppose that, under the respective standards, the system can process a 16-byte width in one cycle while the transmission path processes a 4-byte width in one cycle. Then, as shown in FIG. 3, both the header information and the data in the system must each be divided into four for transmission over the transmission path. As shown in FIG. 3, a transaction 303 that consists of one cycle of header information 301 and eight cycles of data 302 inside the system takes 36 cycles (9 cycles × 4) on the transmission path, because the byte width processed per cycle is 1/4.
[0021]
Next, a specific method of dividing the data into the sub-data packets described above will be explained. As a concrete example, suppose there are three inter-node communication interfaces, [1], [2], and [3], and the data is 8 cycles long. Dividing into sub-data packets then means deciding how the 8 cycles of data are distributed among the interfaces. In conclusion, referring to FIG. 3, D1, D4, and D7 are allocated to interface [1], D2, D5, and D8 to interface [2], and D3 and D6 to interface [3], and header information H is added to the head of each sub-data packet. The header information of the sub-data packets is identical in content. It differs from the in-system header information in the byte width processed per cycle, but its content after packet integration is the same.
[0022]
The above division into sub-data packets can be described in general terms as follows. The write transaction consists of one cycle of request header information, which includes the memory address, and M cycles of data. The request header cycle H is placed at the head of all N sub-data packets. The M cycles of data D(1), D(2), ..., D(M) are divided so that sub-data packet 1 holds d(y) = D(xN + 1), sub-data packet 2 holds d(y) = D(xN + 2), ..., and sub-data packet N holds d(y) = D(xN + N), where x is an integer equal to or greater than 0 and y is the number assigned in order to the data within each sub-data packet. Sub-data packets 1 to N divided in this way are transmitted using the respectively corresponding communication interfaces [1] to [N].
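The round-robin distribution above can be sketched in a few lines. The function name `split_data` and the representation of a sub-data packet as a `(header, payload)` tuple are assumptions made for illustration.

```python
def split_data(header, data, n):
    """Divide M data cycles into n sub-data packets, each led by the header.

    Sub-data packet j (1-based) receives D(j), D(N + j), D(2N + j), ...
    which is the slice data[j-1::n] on a 0-based list.
    """
    if not 1 <= n <= len(data):
        raise ValueError("n must be between 1 and the number of data cycles")
    return [(header, data[j::n]) for j in range(n)]


# Example from the description: 8 data cycles over 3 interfaces.
packets = split_data("H", ["D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8"], 3)
```

Here `packets[0]` carries D1, D4, D7, `packets[1]` carries D2, D5, D8, and `packets[2]` carries D3, D6, matching the allocation described for interfaces [1] to [3].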
[0023]
Next, the circuit that implements transaction reception control will be described. In the receiving node 102, a transaction or sub-data packet transmitted over the communication path 115 is received at communication interface [1] 113 or communication interface [2] 114, which correspond respectively to communication interface [1] 107 and communication interface [2] 108 of the transmitting node 101. The header detection unit 120 first determines whether the received transaction or sub-data packet is a read request transaction or a sub-packet of a write request transaction. If it is a read request transaction, the transaction passes through the transaction integration unit 121 unchanged and is stored in the reception buffer 122. The request transactions stored in the reception buffer are passed in order to the destination memory 111.
[0024]
On the other hand, if the header detection unit 120 determines that the packet is a sub-packet of a write request transaction and no header cycle is yet stored in the comparison circuit 123, the sub-packet is judged to be the first sub-packet into which the write request transaction was divided, and its header cycle is stored in the comparison circuit 123. The header cycle of the received sub-packet is also notified to the transaction integration unit 121, the number of data cycles M of the original write transaction is detected from the header cycle, and the transaction integration unit reserves a storage area for M data cycles in the reception buffer 122. Next, for each data item d(y) following the header cycle, the transaction integration unit 121 restores data D((y-1)N + i) according to the number [i] of the receiving communication interface, from communication interface [1] to communication interface [N], and stores it in the corresponding area of the reception buffer. In the example above, D1, D4, and D7 are restored for communication interface [1], and these data D1, D4, and D7 are stored in the first, fourth, and seventh areas of the eight areas reserved in the reception buffer.
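The index mapping D((y-1)N + i) can be checked with a short sketch. The function name `restore` and the list-based reception buffer are assumptions made for illustration.

```python
def restore(sub_payloads, m):
    """Rebuild the original M data cycles from N sub-data packet payloads.

    sub_payloads[i-1] is the payload received on interface [i]; its y-th
    item (1-based) is the original data D((y-1)*N + i).
    """
    n = len(sub_payloads)
    buffer = [None] * m  # reserved area for M data cycles
    for i, payload in enumerate(sub_payloads, start=1):
        for y, d in enumerate(payload, start=1):
            buffer[(y - 1) * n + i - 1] = d  # -1 converts to a 0-based slot
    return buffer


restored = restore([["D1", "D4", "D7"], ["D2", "D5", "D8"], ["D3", "D6"]], 8)
```

For interface [1], y = 1, 2, 3 map to slots 1, 4, and 7 of the eight reserved areas, as stated in the example.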
[0025]
Further, if the header detection unit 120 determines that the packet is a sub-packet of a write request transaction and a header cycle is already stored in the comparison circuit 123, the received header cycle is compared with the stored header cycle. If the comparison shows that the header cycles are not the same, it is judged that a new first sub-packet has been received, and the same processing as described above is performed. If the comparison shows that the header cycles are the same, the header cycle is not notified to the transaction integration unit 121. Instead, the comparison circuit 123 notifies the transaction integration unit 121 of an integration instruction, so that data D((y-1)N + i) is restored from the data d(y) following the header cycle in the same manner as described above and stored in the corresponding area of the reception buffer 122.
[0026]
The transaction integration unit 121 also detects, when it is notified of the first header cycle, the number of divisions of the transaction it is about to integrate. When the integration instructions notified from the comparison circuit 123 reach that number of divisions, it judges that all of the divided sub-data packets have been received and completes the storage of the data in the reception buffer 122. The request transactions stored in the reception buffer are passed in order to the destination memory 111.
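The receive-side flow of paragraphs [0024] to [0026] (header comparison, buffer reservation, integration, and completion detection) can be sketched as a small stateful class. This is a simplified sketch under stated assumptions: the `Header` fields `address`, `m`, and `n` are hypothetical (the description says only that M and the number of divisions are detected from the header), and completion is detected by counting received sub-packets rather than integration instructions.

```python
from typing import NamedTuple, Optional


class Header(NamedTuple):
    address: int  # hypothetical field: destination memory address
    m: int        # hypothetical field: data cycles in the original transaction
    n: int        # hypothetical field: number of sub-data packets


class Reassembler:
    """Combines the roles of comparison circuit 123 and integration unit 121."""

    def __init__(self) -> None:
        self.header: Optional[Header] = None  # stored header cycle
        self.buffer: list = []
        self.received = 0

    def receive(self, interface: int, header: Header, payload: list):
        """Accept one sub-packet from a 1-based interface number.

        Returns the restored data once all sub-packets have arrived,
        otherwise None.
        """
        if self.header != header:
            # No stored header, or a different one: first sub-packet of a
            # new transaction, so store the header and reserve M slots.
            self.header = header
            self.buffer = [None] * header.m
            self.received = 0
        for y, d in enumerate(payload, start=1):
            self.buffer[(y - 1) * header.n + interface - 1] = d
        self.received += 1
        if self.received == header.n:
            done = self.buffer
            self.header = None  # ready for the next transaction
            return done
        return None
```

A usage example with the 8-cycle, 3-interface case: the first two calls return `None`, and the third returns the fully restored data.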
[0027]
FIG. 2 shows an example of a computer system including four nodes 201 to 204. The configuration of each node is the same as that shown in FIG. 1. In the example of FIG. 2, each node has communication interfaces [1] to [4] 205 to 208 and is coupled to the other nodes by four pairs of communication paths 213 via relay devices [1] to [4] 209 to 212. In FIG. 2, each pair of transmission and reception communication paths is shown as a single solid line. Even when the number of nodes constituting the system is increased to four, eight, ..., n nodes in this way, the transaction processing device described with reference to FIG. 1 can be realized by coupling the nodes through a plurality of communication paths via relay devices, or via relay devices arranged in multiple stages.
[0028]
As described above, the embodiment of the present invention has the effect of reducing the latency of a single inter-node transaction transfer. This effect will be described with reference to FIGS. 3 and 4.
[0029]
FIG. 3 shows a case in which a transaction 303 consisting of header information 301 followed by eight cycles of data 302 is transmitted using one transmission path, in an example where the data width of the transmission path (for example, 4 bytes per cycle under its standard) is 1/4 of that inside the system (for example, 16 bytes per cycle under its standard). In this case, the data transmission time 304 for sending the data onto the transmission path requires 36 cycles, and the transmission completion waiting delay 305 until the next transmission onto the transmission path can start is 27 cycles (that is, the latency for the next transaction increases by 27 cycles).
[0030]
In contrast, FIG. 4 shows a case in which the transaction is transmitted using two transmission paths with the same capability as in FIG. 3. In this case, the data transmission time 401 on the transmission paths is 20 cycles, and the transmission completion waiting delay 402 is improved to 11 cycles.
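The cycle counts quoted for FIGS. 3 and 4 can be reproduced with simple arithmetic. In this sketch the function names are hypothetical. The waiting-delay formula (transmission time on the path minus the in-system length of the transaction) is an assumption that happens to match both quoted values, and the description does not state it explicitly.

```python
import math


def link_cycles(header_cycles: int, data_cycles: int, paths: int, width_ratio: int) -> int:
    """Transmission-path cycles needed to send one transaction.

    width_ratio is (in-system bytes per cycle) / (path bytes per cycle),
    4 in the example (16 bytes versus 4 bytes). Each sub-data packet
    carries the full header plus its share of the data, and the longest
    sub-data packet determines the transmission time.
    """
    n = min(paths, data_cycles)                 # number of divisions N
    longest_share = math.ceil(data_cycles / n)  # data cycles in the longest packet
    return (header_cycles + longest_share) * width_ratio


def wait_delay(link: int, header_cycles: int, data_cycles: int) -> int:
    """ASSUMED model: path time minus the in-system transaction length."""
    return link - (header_cycles + data_cycles)


one_path = link_cycles(1, 8, paths=1, width_ratio=4)   # (1 + 8) * 4
two_paths = link_cycles(1, 8, paths=2, width_ratio=4)  # (1 + 4) * 4
```

Under these assumptions `one_path` is 36 and `two_paths` is 20, and the corresponding waiting delays are 27 and 11 cycles, in agreement with the values given for FIGS. 3 and 4.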
[0031]
[Effect of the Invention]
According to the present invention, the latency involved in transaction communication can be reduced by dividing data included in a transaction requiring communication between nodes into sub-data packets and transmitting the sub-data packets through a plurality of communication paths.
[0032]
Further, by detecting the type of transaction transmitted between nodes, it is possible to realize an inter-node transaction processing apparatus that reduces the latency of data transaction transfer without increasing the overhead of data-less transaction transfer between nodes.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating a configuration example of a system including two nodes of an inter-node transaction processing device according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating a configuration example of a system including four nodes of an inter-node transaction processing device according to an embodiment of the present invention.
FIG. 3 is a time chart illustrating an operation example of an apparatus that implements transaction processing between nodes according to the present embodiment using one transmission line.
FIG. 4 is a time chart showing an operation example of an apparatus which realizes transaction processing between nodes according to the present embodiment using two transmission paths.
[Explanation of symbols]
101, 102 Node
103, 109 Processor bus
104, 110 Processor
105, 111 Memory
106, 112 Controller
107, 108, 113, 114 Communication interface
115 Communication path
116 Transmission buffer
117 Transaction type detection circuit
118 Transaction division unit
119 Data division number determination circuit
120 Sub-data packet header detection unit
121 Transaction combining unit
122 Reception buffer
123 Header cycle comparison circuit
201 to 204 Node
205 to 208 Communication interface
209 to 212 Relay device
213 Communication path
301 Header information
302 Data
303 Transaction
304, 401 Data transmission time
305, 402 Transmission completion waiting delay

Claims

An inter-node transaction processing device comprising: a plurality of nodes each including at least one processor, a memory, and a controller for controlling the processor and the memory; and at least two communication paths interconnecting the plurality of nodes, wherein:
The processor issues a read or write request transaction by designating a memory address, and when the memory address is included in a memory in the same node as the processor, passes the request transaction directly to the memory;
When the memory address is included in the memory of a node different from the processor, the request transaction is transmitted to the node having the memory including the memory address by the communication path and passed to the memory of the destination node.
The memory that has received the request transaction reads information of the memory specified by the memory address when the request transaction is a read, and transmits read data together with a read completion notification transaction to the processor that issued the request transaction, If the request transaction is a write, write the write data transmitted from the processor to the memory specified by the memory address, transmit a write completion notification transaction to the processor that issued the request,
Regarding a request transaction or a completion notification transaction requiring communication between nodes, a read request transaction or a write completion notification transaction performs communication through one communication path determined based on the memory address,
A write request transaction including write data or a read completion notification transaction including read data is communicated by dividing the data included in the transaction into a plurality of sub-data packets and transmitting the sub-data packets over the respectively corresponding communication paths, and the controller in the receiving node restores the sub-data packets to the original write request transaction or read completion notification transaction.

In claim 1,
The inter-node transaction processing device, wherein, upon detecting that the transaction requiring communication between the nodes is a write request transaction or a read completion notification transaction, the number of divisions N into the sub-data packets is determined from the number of communication paths K between the nodes and the number of data cycles M included in the write request transaction or the read completion notification transaction.

In claim 1 or 2,
When the transaction is a write request transaction or a read completion notification transaction, header information of the transaction is added to the head of the sub-data packet for each communication path,
The inter-node transaction processing device, wherein the added header information has the same content for each of the communication paths.

A semiconductor package including the inter-node transaction processing device according to claim 1.