JP5408713B2

JP5408713B2 - Cache memory control system and cache memory control method

Info

Publication number: JP5408713B2
Application number: JP2009223963A
Authority: JP
Inventors: 真一嶋田
Original assignee: NEC Computertechno Ltd
Current assignee: NEC Computertechno Ltd
Priority date: 2009-09-29
Filing date: 2009-09-29
Publication date: 2014-02-05
Anticipated expiration: 2029-09-29
Also published as: JP2011076159A

Description

The present invention relates to a cache memory control system and a cache memory control method, and more particularly to a cache memory control system and a cache memory control method for request processing when memory data accesses conflict in an information processing apparatus.

Patent Document 1 discloses a technique relating to a memory controller that controls data input/output between a memory and a plurality of processors each having a cache. On receiving a read request, the memory controller of Patent Document 1 starts reading data from the memory before the processors report their snoop results for that request. On receiving a write request, it waits until the processors have reported their snoop results for that request and then starts writing data to the memory.

Patent Document 2 discloses a technique relating to a memory cache system managed so as to improve performance in a multiprocessor environment. In the memory cache system of Patent Document 2, a first processor accesses data using a first level 1 cache, and a second processor accesses data using a second level 1 cache. A storage control circuit is placed between the first and second level 1 caches, a level 2 cache, and the main memory. The level 2 cache manages copies of data in main storage and additionally manages indications of whether the level 1 caches hold copies of the data and whether those copies have been modified.

Patent Document 3 discloses a technique that improves the performance of a multiprocessor system by allowing main memory accesses to proceed ahead according to the processing state of each node. The multiprocessor system of Patent Document 3 not only selects and orders data read accesses synchronously at all nodes but also selects and orders data write-back completion notifications synchronously at all nodes, so that the order of data reads and the order of write-back completions observed at every node are unique. Each node compares the target addresses of the ordered data read accesses with those of the ordered write-back completion notifications and detects any data read to the same address that is overtaken by a write-back completion, thereby determining the order of the data read and the data write-back. Data coherency is maintained by sending, to the node that issued the overtaken data read access, a coherency response prompting it to reread the data.

Patent Document 4 discloses a technique relating to a cache memory control scheme that achieves high system performance by eliminating unnecessary snoop processing in a shared-memory multiprocessor system with write-back cache memories. In the scheme of Patent Document 4, each processor has a copy tag with a state field that takes four states: invalid, shared, exclusive-match, and exclusive-modified. When the cache memory transitions to the exclusive-modified state, the copy tag is notified so that it matches the state of the cache memory. A bus monitoring circuit detects read requests to the shared memory on the shared bus and checks the copy tag, which reduces accesses to the cache memory. Only on a hit to a block in the exclusive-modified state is a modified signal line asserted to make the shared memory wait, and snoop processing is then performed on the cache memory.

Patent Document 5 discloses a technique relating to a multiprocessor system that arbitrates exclusive control efficiently and thereby prevents situations, such as timeouts, that disrupt system processing. The multiprocessor system of Patent Document 5 comprises cards each carrying a processor, a bus interconnecting the cards, an exclusive control flag indicating the presence of a processor currently executing exclusive control, and, for each processor, a failure flag indicating that an exclusive control request has failed and a success flag indicating that a reissued exclusive control request has succeeded. The exclusive control request from each card is received by the other cards, and exclusive control is arbitrated by a distributed arbitration scheme in which every card determines and executes the same request processing order based on common arbitration logic.

Patent Document 6 discloses a technique relating to a multiprocessor system composed of a plurality of nodes, each having memory local to that node, together with a tag-and-address crossbar system and a data crossbar system that interconnect all the nodes. The multiprocessor system of Patent Document 6 uses global snooping to provide a single point of serialization for data tags. A central crossbar controller simultaneously examines the cache state tags of a given address line for all nodes and issues an appropriate response to the node requesting the data, while maintaining cache coherence and generating data requests to other nodes in the system to supply the requested data. The system makes use of the memory local to each node by dividing it into local and remote categories that are mutually exclusive for a given cache line. The disclosure provides support for a third-level remote cache at each node.

JP 2000-250811 A
JP 2000-250812 A
JP 2006-323432 A
JP H10-222423 A
Japanese Patent No. 3560534
JP 2005-539282 A (published Japanese translation of a PCT application)

However, even with the techniques of Patent Documents 1 to 6, a problem remains in an address conflict state, that is, when a plurality of processors in a multiprocessor system access the same cache line address in main memory during the same period: the time taken until the last conflicting request is processed increases. The first reason is that processing a processor-issued request in an address conflict state always incurs the latency of a snoop request from the cache coherency control unit to the target processor and the latency of its completion. The second reason is that current multiprocessor systems commonly have from several tens to several hundreds of processors, and when the number of processors is large, physical constraints on building the information processing apparatus increase the latency between the cache coherency control unit and the processors. Such address conflict handling occurs routinely in semaphore lock operations in multiprocessor systems.

The occurrence of this problem is illustrated below. The SMP (Symmetric Multi-Processor) system considered here comprises: a plurality of CPUs and I/O processors, each containing a store-in cache memory that temporarily holds main memory data and uses a common protocol such as MESI; a cache coherency control circuit that guarantees coherency among the caches of the CPUs and I/O processors by a directory scheme, registering which CPU or I/O processor cache holds the main memory data it manages; and a memory control circuit that receives memory access requests from the CPUs and I/O processors via the cache coherency control circuit and directs write and read operations on the main memory under it. These components are connected by one-to-one connections, a bus connection, a star connection, or the like.

Consider an address conflict state, in which a plurality of processors access the same cache line address in main memory during the same period. In this case, in Patent Document 5 described above, guaranteeing cache coherency within the system while avoiding a so-called livelock state, in which a particular processor can never access memory, requires either a single cache coherency control unit or cache coherency control units that are distributed across several locations yet operate in complete synchronization. That cache coherency control unit then had to process the conflicting requests sequentially, one at a time.

Under this scheme, when a plurality of processors issue, during the same period, main memory read requests for the same memory address with the intention of updating it, operation proceeds as follows. It is assumed here that the data in main memory is the only copy in the system.

First, the cache coherency control unit returns the data read from main memory to the processor that issued the request chosen to be processed first. Next, in order to process the request chosen to be processed second, the cache coherency control unit issues a request instructing that the cached data be flushed (a cache snoop request) to the processor that issued the first request. As a result, that processor issues the cached data and a completion for the cache snoop request.

The cached data is returned to the processor that issued the request chosen to be processed second, and the completion is returned to the cache coherency control unit.

Then, on receiving the completion from the processor that issued the first request, the cache coherency control unit learns that the cached data has been flushed from that processor and passed to the processor that issued the request chosen to be processed second. Triggered by this, the cache coherency control unit issues a cache snoop request to the processor that issued the second request in order to process the request chosen to be processed third. The same operation as for the second request follows, and it is repeated for every further processor that issued a conflicting request. Thus, in an address conflict state in a multiprocessor system, the time taken until the last conflicting request is processed increases.
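The serialized handling described above can be illustrated with a rough latency model. This is only a sketch under stated assumptions: the function name and the latency figures in the usage note are hypothetical and do not come from the patent, and the model assumes that each additional conflicting request costs one snoop hop from the cache coherency control unit to the previous owner plus one completion hop back, with cache-to-cache data forwarding taking about as long as the completion.

```python
def conventional_last_request_latency(num_conflicting, memory_read, snoop, completion):
    """Rough time until the last of `num_conflicting` same-address requests
    is served under the conventional, fully serialized scheme.

    All parameters are hypothetical latencies in arbitrary time units.
    """
    if num_conflicting < 1:
        raise ValueError("need at least one request")
    # First request: data is returned from main memory.
    total = memory_read
    # Each later request waits for a snoop to the previous owner and for
    # that owner's completion to travel back to the coherency control unit.
    total += (num_conflicting - 1) * (snoop + completion)
    return total
```

With hypothetical values `memory_read=100`, `snoop=40`, and `completion=40`, four conflicting requests cost 100 + 3 × 80 = 340 units in this model, and the cost grows linearly with the number of conflicting processors.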

The present invention has been made to solve this problem, and its object is to provide a cache memory control system and a cache memory control method that can shorten the time taken until the last conflicting request is processed in an address conflict state.

A cache memory control system according to a first aspect of the present invention comprises a transmission/reception control unit connected to a plurality of processors each having a cache memory, and a coherency control unit that accesses a main memory and maintains consistency among the cache memories of the plurality of processors. When the target addresses of a plurality of access requests from the plurality of processors to the main memory, including at least an exclusive data read, conflict, the coherency control unit transmits to the transmission/reception control unit a response instruction that includes conflict information, which is information about the conflict among those access requests. Based on the conflict information included in the response instruction, the transmission/reception control unit returns data corresponding to the access request to a reply target processor determined from among the processors whose access request was the exclusive data read, and subsequently transmits a snoop request requesting acquisition of the data in the cache memory of that reply target processor.

A cache memory control method according to a second aspect of the present invention is a method for a multiprocessor system comprising a transmission/reception control unit connected to a plurality of processors each having a cache memory, and a coherency control unit that accesses a main memory and maintains consistency among the cache memories of the plurality of processors. In the method, the transmission/reception control unit receives a plurality of access requests from the plurality of processors to the main memory, including at least an exclusive data read. When the target addresses of the plurality of access requests conflict, the coherency control unit transmits to the transmission/reception control unit a response instruction that includes conflict information, which is information about the conflict among those access requests. Based on the conflict information included in the response instruction, the transmission/reception control unit returns data corresponding to the access request to a reply target processor determined from among the processors whose access request was the exclusive data read. Subsequently, the transmission/reception control unit transmits a snoop request requesting acquisition of the data in the cache memory of that reply target processor.

The present invention makes it possible to shorten the time taken until the last conflicting request is processed in an address conflict state.

A block diagram showing the configuration of the cache memory control system according to Embodiment 1 of the present invention.
A flowchart showing the flow of the cache memory control method according to Embodiment 1 of the present invention.
A block diagram showing the configuration of the multiprocessor system according to Embodiment 2 of the present invention.
An explanation of the meaning of each status of the MESI protocol, one of the cache protocols used in Embodiment 2 of the present invention.
A configuration diagram of the directory implemented for the cache status management function according to Embodiment 2 of the present invention.
An explanation of the directory cache status according to Embodiment 2 of the present invention.
An explanation of the caching agent information according to Embodiment 2 of the present invention.
An explanation of the messages according to Embodiment 2 of the present invention.
A list of requests and responses from a processor and the status transitions of the in-processor cache according to Embodiment 2 of the present invention.
A list of requests and responses from the system interface and the status transitions of the in-processor cache according to Embodiment 2 of the present invention.
A block diagram showing the structure of the conflict information according to Embodiment 2 of the present invention.
An explanation of the contents of the conflict information according to Embodiment 2 of the present invention.
A table showing the transitions of the internal information of the address conflict control unit when the coherency control unit receives EBR and SBR according to Embodiment 2 of the present invention.
A table explaining response generation in the coherency control unit according to Embodiment 2 of the present invention.
A diagram showing the internal structure of the responses Rsp_Data_E and Rsp_Data_S according to Embodiment 2 of the present invention.
A diagram showing the internal structure of the response Rsp_Data_Cnflt according to Embodiment 2 of the present invention.
A flowchart showing the processing flow of the coherency control unit at the time of an address conflict according to Embodiment 2 of the present invention.
A flowchart showing the flow of processing in which the coherency control unit generates the response Rsp_Data_Cnflt according to Embodiment 2 of the present invention.
A flowchart showing the first half of the processing performed when a response/snoop control unit receives Rsp_Data_Cnflt according to Embodiment 2 of the present invention.
A flowchart showing the second half of the processing performed when a response/snoop control unit receives Rsp_Data_Cnflt according to Embodiment 2 of the present invention.

Specific embodiments to which the present invention is applied are described in detail below with reference to the drawings. In the drawings, identical elements carry identical reference numerals, and duplicate descriptions are omitted where appropriate for clarity.

<Embodiment 1 of the Invention>
FIG. 1 is a block diagram showing the configuration of the cache memory control system according to Embodiment 1 of the present invention. The cache memory control system 15 is connected to the processor 11, the processor 12, and the main memory 18, and manages the cache memories 13 and 14 of the processors 11 and 12, respectively.

The processor 11 is a CPU and I/O processor equipped with the cache memory 13. It is, for example, a store-in requester that uses a common cache protocol such as the MESI protocol. The processor 12 is a CPU and I/O processor equipped with the cache memory 14 and is otherwise the same as the processor 11. The number of processors in Embodiment 1 of the present invention need only be two or more.

The cache memory control system 15 includes a transmission/reception control unit 16 and a coherency control unit 17. The transmission/reception control unit 16 is connected to the processors 11 and 12, receives requests from them, and forwards those requests to the coherency control unit 17. It also receives instructions from the coherency control unit 17 and transmits a predetermined response to the processor 11 or 12. When the target addresses of a plurality of access requests from the processors 11 and 12 to the main memory 18, including at least an exclusive data read, conflict, the coherency control unit 17 transmits to the transmission/reception control unit 16 a response instruction that includes conflict information, which is information about the conflict among those access requests. Based on the conflict information included in the response instruction from the coherency control unit 17, the transmission/reception control unit 16 returns data corresponding to the access request to a reply target processor determined from among the processors whose access request was the exclusive data read, and subsequently transmits a snoop request requesting acquisition of the data in the cache memory of that reply target processor.

In response to a data read request from the coherency control unit 17 that specifies a target address, the main memory 18 returns the data at that address. In response to a data write request from the coherency control unit 17 that specifies a target address, the main memory 18 updates the data at that address and returns a notification to that effect.

FIG. 2 is a flowchart showing the flow of the cache memory control method according to Embodiment 1 of the present invention.

First, the transmission/reception control unit 16 receives a plurality of access requests to the main memory 18 from the processors 11 and 12 (S11). The plurality of access requests are assumed to include at least an exclusive data read and to target the same address in the main memory 18. The transmission/reception control unit 16 forwards the received access requests to the coherency control unit 17.

Next, when the target addresses of the plurality of access requests conflict, the coherency control unit 17 transmits to the transmission/reception control unit 16 a response instruction that includes conflict information, which is information about the conflict among those access requests (S12).

Then, based on the conflict information included in the response instruction, the transmission/reception control unit 16 returns data corresponding to the access request to a reply target processor determined from among the processors whose access request was the exclusive data read (S13).

Subsequently, the transmission/reception control unit 16 transmits a snoop request requesting acquisition of the data in the cache memory of that reply target processor (S14).

This shortens the time spent on the completion sent from the transmission/reception control unit 16 to the coherency control unit 17 and on the snoop sent from the coherency control unit 17 to the transmission/reception control unit 16. In this way, Embodiment 1 of the present invention shortens the time taken until the last conflicting request is processed in an address conflict state.
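Steps S11 to S14 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the `AccessRequest` type, the dictionary layout of the response instruction, and the rule that the first exclusive reader in arrival order becomes the reply target are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AccessRequest:
    processor: int   # requester, e.g. 11 or 12
    address: int     # target address in main memory
    exclusive: bool  # True for an exclusive data read


def build_response_instruction(requests):
    """Coherency control unit side (S12). Returns a response instruction
    carrying conflict information, or None when there is no conflict.
    The dict layout is hypothetical."""
    addresses = {r.address for r in requests}
    if len(requests) < 2 or len(addresses) != 1:
        return None  # requests do not target one common address
    exclusive_readers = [r.processor for r in requests if r.exclusive]
    if not exclusive_readers:
        return None  # this sketch only covers conflicts with exclusive reads
    return {"address": requests[0].address, "conflict_info": exclusive_readers}


def handle_response_instruction(instruction):
    """Transmission/reception control unit side (S13, S14)."""
    # Assumption: the first exclusive reader is chosen as the reply target.
    target = instruction["conflict_info"][0]
    return [
        ("data", target),   # S13: return the requested data
        ("snoop", target),  # S14: follow up with the snoop request
    ]
```

For two exclusive reads of one address from processors 11 and 12, `handle_response_instruction` yields a data reply followed directly by a snoop to processor 11. The point of the arrangement is that the snoop is issued by the transmission/reception control unit itself, without a further round trip to the coherency control unit.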

<Embodiment 2 of the Invention>
FIG. 3 is a block diagram showing the configuration of the multiprocessor system 100 according to Embodiment 2 of the present invention. The multiprocessor system 100 is an example of the cache memory control system according to Embodiment 1 described above. The multiprocessor system 100 is shown with a plurality of processors connected one-to-one to two node controllers, but the method of connecting node controllers and processors is not limited to this. For example, a bus connection or a star connection may be used instead.

The processors 101 to 108 are CPUs and I/O processors equipped with store-in cache memories 111 to 118, respectively, each using a common cache protocol such as the MESI protocol. The processors 101 to 108 act as requesters of main memory access requests. The processors 101 to 108 used in Embodiment 2 of the present invention are known components, so a detailed description is omitted. The number of processors is not limited to eight and need only be two or more. FIG. 4 explains the meaning of each status of the MESI protocol, which is an example of the cache protocol used in Embodiment 2 of the present invention.
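For readers unfamiliar with MESI, the four statuses can be written down as a small Python enum. The contents of FIG. 4 are not reproduced in this text, so the sketch below follows the generally known definition of the protocol rather than the figure itself; the helper function names are hypothetical.

```python
from enum import Enum


class Mesi(Enum):
    MODIFIED = "M"   # only copy in any cache, differs from main memory
    EXCLUSIVE = "E"  # only copy in any cache, identical to main memory
    SHARED = "S"     # may also be held by other caches, identical to main memory
    INVALID = "I"    # line holds no valid data


def may_write_without_request(state):
    """A cache may update a line locally only if it is the sole holder."""
    return state in (Mesi.MODIFIED, Mesi.EXCLUSIVE)


def must_write_back_on_eviction(state):
    """In a store-in (write-back) cache only a modified line is newer
    than main memory and has to be written back."""
    return state is Mesi.MODIFIED
```

An exclusive data read is what a processor issues to obtain a line in a state where `may_write_without_request` holds, which is why conflicting exclusive reads to one address must be served one owner at a time.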

The main memory management system 151 includes a main memory 153 and a main memory controller 155. The main memory management system 152 includes a main memory 154 and a main memory controller 156. The main memory management systems 151 and 152 read and update main memory data as appropriate in response to main memory access requests from the processors 101 to 108 received via the coherency control units 143 and 144. The main memory management systems 151 and 152 used in Embodiment 2 of the present invention are known components, so a detailed description is omitted.

The node controller 121 is connected to the processors 101 to 104 and controls access to the main memory management system 151. Via the node controller 122, the node controller 121 also controls access from the processors 105 to 108 to the main memory management system 151 and forwards access requests from the processors 101 to 104 to the main memory management system 152. The node controller 121 includes response/snoop control units 131 to 134, a crossbar 141, and a coherency control unit 143.

Similarly, the node controller 122 is connected to the processors 105 to 108 and controls access to the main memory management system 152. Via the node controller 121, the node controller 122 also controls access from the processors 101 to 104 to the main memory management system 152 and forwards access requests from the processors 105 to 108 to the main memory management system 151. The node controller 122 includes response/snoop control units 135 to 138, a crossbar 142, and a coherency control unit 144. The number of node controllers used in the second embodiment is not limited to two.

The crossbar 141 is a general crossbar circuit that connects the response/snoop control units 131 to 134, the coherency control unit 143, and the crossbar 142. Similarly, the crossbar 142 is a general crossbar circuit that connects the response/snoop control units 135 to 138, the coherency control unit 144, and the crossbar 141.

The response/snoop control unit 131 is directly connected to the processor 101 and returns a predetermined response to main memory access requests issued by the processor 101. Via the crossbar 141, the response/snoop control unit 131 can also communicate with the coherency control unit 143, the response/snoop control units 132 to 134, and the node controller 122.

The following describes, in particular, the case in which a plurality of processors including the processor 101 have requested access to the same address in the main memory 153, that is, the case of an address conflict. The main memory access request from the processor 101 is assumed here to be an exclusive data read. In this case, the response/snoop control unit 131 receives a response instruction from the coherency control unit 143 or 144, or from one of the other response/snoop control units 132 to 138. Based on the conflict information contained in that response instruction, the response/snoop control unit 131 returns the data for the main memory access request to the processor 101 and then sends a snoop request asking for the data held in the cache memory 111 of the processor 101. The conflict information is described later.

Based on the conflict information contained in the response issued when an address conflict is detected, the response/snoop control unit 131 also determines which processor is to receive a reply after the processor 101. After receiving the response of the processor 101 to the snoop request, it returns the data for the access request to that next processor. Data can thus be returned in succession to the processors whose addresses conflict without going through the coherency control unit 143 or 144, which reduces the time spent exchanging messages with the coherency control unit 143 or 144.

The response/snoop control units 132 to 138 operate in the same way as the response/snoop control unit 131, so a detailed description is omitted.

The coherency control unit 143 accesses the main memory 153 and the main memory controller 155 in the connected main memory management system 151 and maintains the coherence of the cache memories 111 to 118 of the processors 101 to 108. That is, the coherency control unit 143 handles main memory access requests directed to the main memory management system 151. It receives access requests from the processors 101 to 108 via the crossbar 141 and accesses the main memory management system 151 accordingly. It also returns the access result from the main memory management system 151 to the designated processor via the crossbar 141. The coherency control unit 143 includes a cache status management function 145 and an address conflict control unit 147.

The cache status management function 145 is a circuit that implements a general system-wide cache status management function, such as a directory scheme. FIG. 5 shows an example of the structure of the directory implemented for the cache status management function according to the second embodiment. The following description assumes a full directory scheme, which can manage all of main memory in units of the cache line size. That is, each entry always corresponds to exactly one main memory address. The present invention can nevertheless also be realized with a directory scheme that does not manage all main memory addresses at the same time, such as a set-associative scheme. For each entry, the cache status management function 145 holds a cache status 301 associated with caching agent information 302. FIG. 6 explains the cache status 301 of the directory according to the second embodiment. The cache status 301 is the cache status information for a main memory address and indicates in which cache status the processors in the system hold that data. FIG. 7 explains the caching agent information 302 according to the second embodiment. The caching agent information 302 indicates which processors hold the data in that cache status. The cache status management function 145 is not limited to this form and may be implemented otherwise.
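The directory entry described above can be pictured with a minimal Python sketch. The names `DirectoryEntry`, `FullDirectory`, and `entry_for`, the 64-byte line size, and the single-letter status strings are assumptions made for illustration only; the actual encoding is defined by FIG. 5 to FIG. 7. Representing the caching agent information as a bitmask with one bit per processor follows the value 0x48 that this embodiment uses for the processors 104 and 107.

```python
from dataclasses import dataclass


@dataclass
class DirectoryEntry:
    cache_status: str = "I"   # cache status 301 (assumed encoding: one letter per state)
    caching_agents: int = 0   # caching agent information 302 (assumed: one bit per processor)


class FullDirectory:
    """Full directory: one entry per cache-line-sized block of main memory."""

    def __init__(self, line_size: int = 64):  # 64 bytes is an assumed line size
        self.line_size = line_size
        self._entries: dict[int, DirectoryEntry] = {}

    def entry_for(self, address: int) -> DirectoryEntry:
        # Every address inside the same cache line maps to the same entry.
        line = address // self.line_size
        return self._entries.setdefault(line, DirectoryEntry())
```

A set-associative directory would replace the unbounded dictionary with a fixed number of sets and an eviction policy; that variant is not sketched here.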

For the access requests from the processors 101 to 108 received by the coherency control unit 143, the address conflict control unit 147 holds conflict information, that is, information about conflicts among access requests, for each target address being accessed. In other words, the address conflict control unit 147 has conflict information storage means that stores conflict information for each target address.

When the target addresses of a plurality of access requests from a plurality of processors to the main memory 153 conflict, and those requests include at least one exclusive data read, the coherency control unit 143 sends to the crossbar 141 a response instruction that includes conflict information, that is, information about the conflict among those access requests.

The coherency control unit 143 also determines the first reply target processor based on the conflict information and sends the response instruction, addressed to that processor, to the corresponding one of the response/snoop control units 131 to 138. Thus, when a plurality of response/snoop control units are connected one-to-one with a plurality of processors, as in the multiprocessor system 100 according to the second embodiment, determining the first reply target from the conflict information allows the response to be sent to the appropriate destination.

The coherency control unit 144, together with its cache status management function 146 and address conflict control unit 148, is equivalent to the coherency control unit 143, the cache status management function 145, and the address conflict control unit 147 described above, so a detailed description is omitted.

The system interfaces 161 to 168 connect the processors 101 to 108 to the response/snoop control units 131 to 138, respectively. They carry access requests from the processors 101 to 108 to the main memories 153 and 154 and the corresponding data replies, as well as requests from the node controllers 121 and 122 to the processors 101 to 108 that inquire about the caching state of main memory data (hereinafter called snoop requests) and the corresponding replies.

The interfaces 171 to 174 are interfaces within the node controller 121 that connect the response/snoop control units 131 to 134 to the crossbar 141. The interface 181 is an interface within the node controller 121 that connects the crossbar 141 to the coherency control unit 143.

The interfaces 175 to 178 are interfaces within the node controller 122 that connect the response/snoop control units 135 to 138 to the crossbar 142. The interface 182 is an interface within the node controller 122 that connects the crossbar 142 to the coherency control unit 144.

The memory interface 183 connects the main memory management system 151 to the coherency control unit 143. It carries main memory access requests from the coherency control unit 143 and data replies from the main memory management system 151.

The memory interface 184 connects the main memory management system 152 to the coherency control unit 144. It carries main memory access requests from the coherency control unit 144 and data replies from the main memory management system 152.

The node interface 185 connects the crossbar 141 in the node controller 121 to the crossbar 142 in the node controller 122. It carries main memory access requests and their data replies, and snoop requests and their replies, between response/snoop control units and coherency control units located on different node controllers.

FIG. 8 lists the types of messages used in the second embodiment together with their descriptions. It shows the request messages and the response and completion messages that are needed to realize the multiprocessor system 100 according to the second embodiment and that are exchanged among the processors, the response/snoop control units, the crossbars, and the coherency control units, and between the coherency control units and the main memory management systems. In particular, Rsp_Data_Cnflt in FIG. 8 is a response message that carries, in addition to the contents of Rsp_Data_E and Rsp_Data_S, the same-address conflict state of EBRs detected by the coherency control unit, the same-address conflict state of SBRs, and the request ID information of the conflicting EBRs or SBRs.

FIG. 9 lists the requests and responses on the processor side together with the status transitions of the cache in the processor according to the second embodiment. FIG. 10 lists the requests and responses arriving from the system interface together with the status transitions of the cache in the processor according to the second embodiment.

For example, row 903 of FIG. 9 shows that when a processor issues an EBR while the initial cache status of the issuing processor is S, the response it may receive is Rsp_Data_E. On receiving this response, the processor stores the attached data in its own cache, sets the status to E, and completes processing.

Rows 901, 902, 904, 906 to 909, and 913 to 915 of FIG. 9 are listed as combinations of request and status, but in practice these combinations cannot occur.

Row 916 of FIG. 10 shows the state transition in the following case: an EBR issued by a processor (call it B) other than a given processor (call it A) is received by the coherency control unit, a lookup in its directory shows that the processor A holds the data in status M, and Snp_E is therefore issued to the processor A. Because the cache status of the processor A is M, that status transitions to I, and at the same time the processor A issues Cmp_I and Rsp_Data_E to the system interface and completes processing.

In the SMP system cited in the problem statement above, Cmp_I is forwarded to the coherency control unit, which thereby learns that processing in the processor A has completed and starts processing the next request for the same address, while Rsp_Data_E is returned to the processor B. In the second embodiment, by contrast, the response/snoop control unit that manages the processor A accepts both Cmp_I and Rsp_Data_E, generates Rsp_Data_Cnflt from them, and forwards it to the processor B. The cache status is thereby updated and the coherence of the cache memories is maintained. Because Cmp_I is not forwarded to the coherency control unit, the processing time is shortened by the time that transfer would take.

FIG. 11 is a block diagram showing the structure of the conflict information according to the second embodiment. FIG. 12 explains the contents of the conflict information according to the second embodiment. The conflict information associates a Valid bit 701, an EBR conflict detection flag 702, an SBR conflict detection flag 703, and request IDs 704 with each target address. A Valid bit 701 of 1 indicates that the set is in use. In the EBR conflict detection flag 702, when the coherency control unit receives an EBR, the bit corresponding to the request source of that EBR is set to 1. In the SBR conflict detection flag 703, when the coherency control unit receives an SBR, the bit corresponding to the request source of that SBR is set to 1. The request ID 704 is a field in which the request ID of the request is set and held at the same time that the EBR conflict detection flag 702 or the SBR conflict detection flag 703 is set.

FIG. 11 shows one set, which corresponds to one main memory address; the conflict information is managed in units of such sets, one per main memory address. That is, for each main memory address, the conflict information contains a flag indicating whether a conflict exists, the types of the access requests, and the identification information of those access requests. This allows address conflicts to be managed properly and makes them easy to look up.

FIG. 13 is a table showing how the internal information of the address conflict control units 147 and 148 changes when the coherency control unit according to the second embodiment receives an EBR or an SBR. The operation is described below for the address conflict control unit 147. Row 1301 of FIG. 13 applies when an EBR is accepted. In this case, the address conflict control unit 147 sets to 1 the bit of the EBR conflict detection flag that corresponds to the request source of the EBR and, at the same time, sets the request ID of the EBR in the corresponding request ID field. Row 1302 of FIG. 13 applies when an SBR is accepted. In this case, the address conflict control unit 147 sets to 1 the bit of the SBR conflict detection flag that corresponds to the request source of the SBR and, at the same time, sets the request ID of the SBR in the corresponding request ID field.
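A minimal Python sketch of the update in rows 1301 and 1302 follows. The class name `ConflictEntry`, the method `record`, the string request types, the mapping of bit i to processor 101 + i (inferred from the worked example in this embodiment), and the choice to set Valid when the first request is recorded are assumptions made for illustration; they are not details taken from FIG. 13.

```python
from dataclasses import dataclass, field

NUM_PROCESSORS = 8      # processors 101 to 108
FIRST_PROCESSOR = 101   # assumed: flag bit i corresponds to processor 101 + i


@dataclass
class ConflictEntry:
    """One set of conflict information for a single main memory address."""
    valid: int = 0        # Valid bit 701
    ebr_flags: int = 0    # EBR conflict detection flag 702, one bit per processor
    sbr_flags: int = 0    # SBR conflict detection flag 703, one bit per processor
    request_ids: list = field(default_factory=lambda: [None] * NUM_PROCESSORS)  # request ID 704

    def record(self, kind: str, processor: int, request_id) -> None:
        bit = processor - FIRST_PROCESSOR
        if not 0 <= bit < NUM_PROCESSORS:
            raise ValueError(f"unknown processor {processor}")
        if kind == "EBR":            # row 1301
            self.ebr_flags |= 1 << bit
        elif kind == "SBR":          # row 1302
            self.sbr_flags |= 1 << bit
        else:
            raise ValueError(f"unknown request type {kind!r}")
        self.request_ids[bit] = request_id   # stored together with the flag
        self.valid = 1                       # assumed: the set is in use from now on
```

Keeping the request IDs in a list indexed by the same bit position as the flags means a flag bit and its request ID can always be looked up together.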

FIG. 14 is a table explaining response generation in the coherency control unit according to the second embodiment. It shows which response is chosen when the coherency control unit issues a reply to an EBR or SBR to the requesting processor. That is, depending on the combination of the request type and the Valid bit of the conflict information held by the address conflict control unit before the update, the coherency control unit either returns a normal response (Rsp_Data_E or Rsp_Data_S) or generates Rsp_Data_Cnflt as the response.
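The decision can be sketched as a small Python function. FIG. 14 itself is not reproduced in this text, so the mapping of an EBR to Rsp_Data_E and of an SBR to Rsp_Data_S in the case without a conflict is an assumption, as are the function and parameter names.

```python
def select_response(request_type: str, valid_before_update: int) -> str:
    """Choose the response kind from the request type and the Valid bit."""
    if request_type not in ("EBR", "SBR"):
        raise ValueError(f"unknown request type {request_type!r}")
    if valid_before_update == 1:
        # Conflict information is already registered for this address.
        return "Rsp_Data_Cnflt"
    # Assumed non-conflict mapping: exclusive read -> E, shared read -> S.
    return "Rsp_Data_E" if request_type == "EBR" else "Rsp_Data_S"
```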

FIG. 15 shows the internal structure of the responses Rsp_Data_E and Rsp_Data_S according to the second embodiment. As shown in FIG. 15, Rsp_Data_E and Rsp_Data_S each contain a command code, a destination ID, an original request ID, and data. A known structure may be used for Rsp_Data_E and Rsp_Data_S. FIG. 16 shows the internal structure of the response Rsp_Data_Cnflt according to the second embodiment. As shown in FIG. 16, Rsp_Data_Cnflt contains, in addition to the fields of FIG. 15, an EBR conflict detection flag, an SBR conflict detection flag, and request IDs.

FIG. 17 is a flowchart showing the processing of the coherency control unit when an address conflict occurs according to the second embodiment. All main memory target addresses in the following description are assumed to be the same address, and that address is assumed to be in the main memory 153. As the initial state, none of the cache memories 111 to 118 of the processors 101 to 108 is assumed to cache the data of the main memory 153 at the target address.

First, assume that the processors 101, 103, and 106 issue EBRs and the processors 104 and 107 issue SBRs at about the same time. The node controllers 121 and 122 then receive conflicting access requests from a plurality of processors (S21). Specifically, the response/snoop control units 131, 133, and 136 receive the EBRs issued as access requests by the processors 101, 103, and 106. Similarly, the response/snoop control units 134 and 137 receive the SBRs issued as access requests by the processors 104 and 107. The response/snoop control units 131, 133, and 134 send the received access requests to the crossbar 141, and the response/snoop control units 136 and 137 send the received access requests to the crossbar 142. Because the target addresses of all received access requests are in the main memory 153, the crossbar 142 forwards the access requests it received to the crossbar 141. The crossbar 141 then serializes the received access requests and sends them to the coherency control unit 143 via the interface 181. Here, the coherency control unit 143 is assumed to receive them in the following order: the EBR from the processor 101, the EBR from the processor 103, the SBR from the processor 104, the EBR from the processor 106, and the SBR from the processor 107.

Next, the coherency control unit 143 registers the conflict information (S22). Specifically, it uses the received EBRs and SBRs to update the conflict information stored in the address conflict control unit as shown in FIG. 13. That is, on receiving a plurality of access requests, the coherency control unit associates each processor that made an access request with the type of that request and stores the result as conflict information in the conflict information storage means. The conflict information is thus registered properly and is easy to look up.

Here, after the coherency control unit 143 has received all the access requests, the conflict information is Valid = 1, EBR conflict detection flag (7:0) = 0x25, and SBR conflict detection flag (7:0) = 0x48. Request ID_0, request ID_2, request ID_3, request ID_5, and request ID_6 hold, respectively, the request ID of the EBR from the processor 101, the request ID of the EBR from the processor 103, the request ID of the SBR from the processor 104, the request ID of the EBR from the processor 106, and the request ID of the SBR from the processor 107.
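These flag values can be checked by decoding each set bit back into a processor number. The short Python sketch below assumes that bit i corresponds to processor 101 + i, which matches the request ID slots listed above; the helper name `processors_in` is hypothetical.

```python
FIRST_PROCESSOR = 101  # assumed: flag bit i corresponds to processor 101 + i


def processors_in(flags: int, width: int = 8) -> list[int]:
    """Return the processor numbers whose bits are set in an 8-bit flag field."""
    return [FIRST_PROCESSOR + bit for bit in range(width) if (flags >> bit) & 1]
```

With this mapping, `processors_in(0x25)` yields the processors 101, 103, and 106 (the EBR issuers), and `processors_in(0x48)` yields the processors 104 and 107 (the SBR issuers).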

The coherency control unit 143 then reads the data at the target address from the main memory 153 (S23). Specifically, in this state the coherency control unit 143 sends MEM_RQ_R to the main memory management system 151 via the memory interface 183 and receives MEM_RSP_D from the main memory management system 151.

After that, the coherency control unit 143 determines whether the Valid bit for the target address is 1 (S24). If the Valid bit for the target address is determined to be 1, the coherency control unit 143 generates Rsp_Data_Cnflt (S25). That is, the coherency control unit refers to the address conflict control unit 147 to determine whether the target addresses of the plurality of access requests conflict, and if it determines that they conflict, it sends the response instruction to the crossbar 141.

Here, because the Valid bit of the conflict information stored in the address conflict control unit 147 is 1, the operation is that of row 1402 in FIG. 14. The processing that generates Rsp_Data_Cnflt is described later with reference to FIG. 18. The coherency control unit 143 then updates the directory (S27). That is, having generated Rsp_Data_Cnflt for the first reply target processor determined in step S25, the coherency control unit 143 updates the directory in the cache status management function 145 to the state that will be correct as the final result once processing of this Rsp_Data_Cnflt has completed. At this point the directory status becomes S, and the caching agent information becomes 0x48, which indicates that the processors 104 and 107 cache the data.

If the Valid bit for the target address is determined not to be 1 in step S24, the coherency control unit 143 issues Rsp_Data_E or Rsp_Data_S (S26). Step S27 is then executed in the same way.

FIG. 18 is a flowchart showing the processing by which the coherency control unit according to the second embodiment generates the response Rsp_Data_Cnflt.

First, the coherency control unit 143 refers to the conflict information stored in the address conflict control unit 147 and determines whether all bits of the EBR conflict detection flag are 0 (S31). If they are all 0, the coherency control unit 143 selects, as the first reply target processor, the processor corresponding to the lowest-numbered bit that is set to 1 in the SBR conflict detection flag (S32). If they are not all 0, the coherency control unit 143 selects, as the first reply target processor, the processor corresponding to the lowest-numbered bit that is set to 1 in the EBR conflict detection flag (S33). In this example the EBR conflict detection flag is 0x25, so processor 101, which corresponds to bit 0, is selected as the first reply target. The method of selecting the reply target processor is not limited to this one. Any algorithm may be used as long as every processor is eventually selected. In other words, the reply target processor is determined based on a predetermined order among the plurality of processors, and that order may be produced by an arbitrary algorithm. For example, it is desirable for the coherency control unit 143 to adopt an algorithm that selects the processor with the shortest latency, so that the processor with the shortest delay can be selected. Here, to simplify the description, the processor corresponding to the lowest-numbered bit is simply given priority.
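The selection in steps S31 to S33 can be sketched as follows. This is only an illustrative model, not the patented circuit: the function names are hypothetical, and the mapping of bit i to processor 101 + i is an assumption read off the worked example (bit 0 is processor 101).

```python
def lowest_set_bit(flags: int) -> int:
    """Return the index of the least significant 1 bit (flags must be nonzero)."""
    return (flags & -flags).bit_length() - 1


def first_reply_target(ebr_flags: int, sbr_flags: int) -> int:
    """Pick the first reply target processor as in steps S31 to S33.

    Assumption: bit i of either flag corresponds to processor 101 + i.
    """
    if ebr_flags == 0:                      # S31: all EBR flags are 0
        if sbr_flags == 0:
            raise ValueError("no conflicting request is recorded")
        return 101 + lowest_set_bit(sbr_flags)   # S32: lowest SBR requester
    return 101 + lowest_set_bit(ebr_flags)       # S33: lowest EBR requester


if __name__ == "__main__":
    # Flag values taken from the example in the text.
    print(first_reply_target(0x25, 0x48))
```

A latency-aware policy, which the text prefers, would replace `lowest_set_bit` with a different ordering function while leaving the EBR-before-SBR rule unchanged.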

The coherency control unit 143 then sets the ID of the selected first reply target processor in the destination ID field of Rsp_Data_Cnflt (S34). The crossbars 141 and 142 can thus route the response correctly by referring to the destination ID field of Rsp_Data_Cnflt. Next, the coherency control unit 143 sets the request ID of the selected first reply target processor in the original request ID field of Rsp_Data_Cnflt (S35). This lets the processor that is the final destination of the reply recognize which of its own requests the reply answers. The coherency control unit 143 then copies all EBR conflict detection flags, all SBR conflict detection flags, and all request ID information held in the address conflict control unit into the EBR conflict detection flag field, the SBR conflict detection flag field, and the request ID fields of Rsp_Data_Cnflt (S36).

Finally, the coherency control unit 143 issues Rsp_Data_Cnflt and transmits it to the response/snoop control unit 131, to which processor 101, the selected first reply target processor, is connected (S37). That is, the coherency control unit associates each processor that made one of the access requests with the type of that access request and includes this association, as conflict information, in the response instruction Rsp_Data_Cnflt. In this way the various flags of the conflict information stored in the address conflict control unit are set in the response. From then on, the response/snoop control unit can determine the next reply target processor without going back to the coherency control unit, which further shortens the processing time.
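As a rough illustration of steps S34 to S36, the response can be modeled as a record that carries the whole conflict state with it. The class and field names below are hypothetical, and the layout (one request ID per flag bit, bit i meaning processor 101 + i) is an assumption based on the example values in the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConflictInfo:
    """Hypothetical model of one entry in the address conflict control unit."""
    valid: bool
    ebr_flags: int
    sbr_flags: int
    request_ids: List[int]   # request_ids[i] belongs to the processor of bit i


@dataclass
class RspDataCnflt:
    """Hypothetical model of the Rsp_Data_Cnflt response."""
    destination_id: int
    original_request_id: int
    ebr_flags: int
    sbr_flags: int
    request_ids: List[int]
    data: bytes


def build_rsp_data_cnflt(info: ConflictInfo, data: bytes) -> RspDataCnflt:
    """Fill in the response fields as in steps S34 to S36."""
    flags = info.ebr_flags if info.ebr_flags else info.sbr_flags
    if flags == 0:
        raise ValueError("no conflicting request is recorded")
    bit = (flags & -flags).bit_length() - 1       # lowest set bit
    return RspDataCnflt(
        destination_id=101 + bit,                 # S34 (assumed ID scheme)
        original_request_id=info.request_ids[bit],  # S35
        ebr_flags=info.ebr_flags,                 # S36: copy all flags
        sbr_flags=info.sbr_flags,
        request_ids=list(info.request_ids),       # S36: copy all request IDs
        data=data,
    )
```

Because the message carries every flag and request ID, a downstream unit has what it needs to pick and address the next recipient on its own.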

FIG. 19 is a flowchart showing the first half of the processing performed when the response/snoop control unit according to the second embodiment of the present invention receives Rsp_Data_Cnflt, and FIG. 20 is a flowchart showing the second half. That is, FIGS. 19 and 20 show how a response/snoop control unit that has received Rsp_Data_Cnflt returns a response to the processor connected to it, issues a snoop request, generates a new Rsp_Data_Cnflt, and generates Cmp_I and WB.

First, the response/snoop control unit 131 receives Rsp_Data_Cnflt from the coherency control unit 143 via the crossbar 141 (S41). Next, it determines whether the bit corresponding to its own connected processor is 1 in the EBR conflict detection flag of Rsp_Data_Cnflt (S42). Specifically, the response/snoop control unit 131 refers to the bits corresponding to processor 101 in the EBR and SBR conflict detection flags of Rsp_Data_Cnflt and identifies whether this Rsp_Data_Cnflt is a response to an EBR or to an SBR issued by processor 101. Here, bit 0 of the EBR conflict detection flag is 1, so the response/snoop control unit 131 identifies it as a response to an EBR.

If it is determined in step S42 that the bit corresponding to its own connected processor is 1, the response/snoop control unit 131 returns Rsp_Data_E to its connected processor 101 (S43). In effect, the response/snoop control unit 131 converts the received Rsp_Data_Cnflt into Rsp_Data_E, which processor 101 can handle. As noted above, this means that processors 101 to 108 in the second embodiment need not be processors specially designed for the present invention; general-purpose processors can be used.

The response/snoop control unit 131 then transmits Snp_E to processor 101 (S44).
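The branch at step S42 and the messages that follow it can be summarized in a small sketch. The function name and the use of plain strings for messages are illustrative assumptions; the Rsp_Data_S branch corresponds to step S48, which the text describes later for processor 104.

```python
from typing import List


def messages_to_local_processor(ebr_flags: int, local_bit: int) -> List[str]:
    """Return the messages a response/snoop control unit sends to its own
    processor after receiving Rsp_Data_Cnflt.

    local_bit is the flag bit assigned to the locally connected processor
    (assumed here to be processor number minus 101).
    """
    if (ebr_flags >> local_bit) & 1:          # S42: local EBR bit is 1
        # S43: answer the exclusive read, then S44: snoop the line back out.
        return ["Rsp_Data_E", "Snp_E"]
    # S48: the local request was a shared read, so no snoop follows.
    return ["Rsp_Data_S"]


if __name__ == "__main__":
    print(messages_to_local_processor(0x25, 0))   # processor 101
    print(messages_to_local_processor(0x00, 3))   # processor 104
```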

When the SMP system described in the problem statement above is used, the response/snoop control unit 131 at this point returns Cmp to the coherency control unit 143 to report that the response to processor 101 has completed. Triggered by receipt of that Cmp, the coherency control unit 143 generates and issues Snp_E, caused by the EBR of processor 103, in order to flush the data out of the cache memory 111 of processor 101, and it was necessary to wait for this Snp_E to reach processor 101. In other words, before Snp_E could be issued to processor 101 in order to hand the data to processor 103, the issuer of the next request to be processed, a time equal to the round-trip latency of Cmp and Snp_E between the response/snoop control unit 131 and the coherency control unit 143 was required.

In the second embodiment of the present invention, the coherency control unit 143 and the response/snoop control unit 131 are located in the same node controller or in node controllers one hop apart. In an actual large-scale computer, multiple stages of node controllers may have to be traversed between the two, in which case the round-trip time of Cmp and Snp_E between the response/snoop control unit 131 and the coherency control unit 143 becomes even longer.

The second embodiment of the present invention therefore aims to shorten the response latency that this round trip adds when data passes from processor 101 to a processor other than processor 101. That is, after responding to processor 101, the response/snoop control unit 131 does not return Cmp to the coherency control unit 143 and then wait for Snp_E to arrive. Instead, in step S44 it generates Snp_E on its own and issues it to processor 101. Processor 103 then waits for the resulting Cmp_I and Rsp_Data_E to be returned from processor 101.

The response/snoop control unit 131 then determines whether it has received Cmp_I + Rsp_Data_E from the processor in response to Snp_E (S45). If not, it repeats step S45. Once they have been received, the response/snoop control unit 131 replaces the data part of Rsp_Data_Cnflt with the data of Rsp_Data_E (S46), which brings the data up to date. It then resets to 0 the bit corresponding to its own connected processor 101 in the EBR conflict detection flag of Rsp_Data_Cnflt (S47).
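Steps S46 and S47 amount to refreshing the payload and clearing one flag bit before the message moves on. The sketch below models only those two fields; the class and function names are hypothetical, and the polling loop of step S45 is left out.

```python
from dataclasses import dataclass


@dataclass
class RspDataCnflt:
    """Hypothetical, reduced model of the response: flag and payload only."""
    ebr_flags: int
    data: bytes


def absorb_snoop_result(msg: RspDataCnflt, local_bit: int,
                        snooped_data: bytes) -> RspDataCnflt:
    """Apply steps S46 and S47 once Cmp_I + Rsp_Data_E has arrived."""
    msg.data = snooped_data                 # S46: carry the latest cache line
    msg.ebr_flags &= ~(1 << local_bit)      # S47: this EBR has been served
    return msg


if __name__ == "__main__":
    m = absorb_snoop_result(RspDataCnflt(0x25, b"stale"), 0, b"fresh")
    print(hex(m.ebr_flags), m.data)
```

With the example value 0x25 and local bit 0, the flag becomes 0x24, matching the transition described for processor 101.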

Proceeding to FIG. 20, the response/snoop control unit 131 determines whether all bits of the EBR conflict detection flag are 0 (S51). At this point the response/snoop control unit 131 has just updated the EBR conflict detection flag from 0x25 to 0x24 in step S47. The flag is therefore not all 0, and step S51 determines accordingly. The response/snoop control unit 131 then selects, as the next reply target processor, the processor corresponding to the lowest-numbered bit that is set to 1 in the EBR conflict detection flag (S56). Here the EBR conflict detection flag is 0x24, so processor 103 is selected as the next reply target processor.

The response/snoop control unit 131 then sets the ID of the selected next reply target processor in the destination ID field of Rsp_Data_Cnflt (S57). Specifically, it sets the ID corresponding to processor 103, which was selected as the next reply target processor, in the destination ID field of Rsp_Data_Cnflt.

The response/snoop control unit 131 also sets the request ID of the selected next reply target processor in the original request ID field of Rsp_Data_Cnflt (S58). Specifically, it reads the request ID of the EBR issued by processor 103 from request ID_2 of Rsp_Data_Cnflt and sets it in the original request ID field.

The response/snoop control unit 131 then issues Rsp_Data_Cnflt and transmits it to the response/snoop control unit 133, to which processor 103, the selected next reply target processor, is connected (S59). Specifically, the response/snoop control unit 131 sends Rsp_Data_Cnflt to the crossbar 141 and, through it, to the response/snoop control unit 133. The response/snoop control unit 133 then routes the response appropriately according to the destination ID field of Rsp_Data_Cnflt and transmits it to processor 103.

From then on, the response/snoop control units 133 and 136, which are connected to processors 103 and 106, the remaining EBR request sources, perform the processing of FIGS. 19 and 20 in the same way as the response/snoop control unit 131.

In the response/snoop control unit 136, which is connected to processor 106, the last EBR source, the processing up to step S47 is the same as in the response/snoop control units 131 and 133, but in step S51 the EBR conflict detection flag is determined to be all 0. When the EBR conflict detection flag is determined to be all 0, the response/snoop control unit 136 determines whether the SBR conflict detection flag is all 0 (S52).

Here the SBR conflict detection flag of Rsp_Data_Cnflt is 0x48, so the response/snoop control unit 136 determines that the SBR conflict detection flag is not all 0. It then selects processor 104, the processor corresponding to the lowest-numbered bit that is set to 1 in the SBR conflict detection flag, as the next reply target processor (S55).

The response/snoop control unit 136 then sets the ID denoting processor 104 in the destination ID of Rsp_Data_Cnflt (S57), extracts the value of request ID_3 of Rsp_Data_Cnflt, and sets it in the original request ID (S58). It then issues Rsp_Data_Cnflt and transmits it to the response/snoop control unit 134, to which processor 104, the selected next reply target processor, is connected (S59).
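The forwarding decision made by each response/snoop control unit after it has cleared its own bit (steps S51, S52, S55, and S56) can be sketched as a three-way choice. The function name and the returned tuples are illustrative assumptions; the "complete" outcome stands for step S53, in which Cmp_I and WB are returned to the coherency control unit.

```python
from typing import Optional, Tuple


def decide_next(ebr_flags: int, sbr_flags: int) -> Tuple[str, Optional[int]]:
    """Choose what to do with Rsp_Data_Cnflt after the local bit is cleared.

    Assumption: bit i corresponds to processor 101 + i, and request ID_i
    is the request ID to copy into the original request ID field.
    """
    def lowest(flags: int) -> int:
        return (flags & -flags).bit_length() - 1

    if ebr_flags:                               # S51: some EBR is pending
        return ("forward", 101 + lowest(ebr_flags))   # S56
    if sbr_flags:                               # S52: some SBR is pending
        return ("forward", 101 + lowest(sbr_flags))   # S55
    return ("complete", None)                   # S53: nothing left to serve


if __name__ == "__main__":
    print(decide_next(0x24, 0x48))   # after processor 101 is served
    print(decide_next(0x00, 0x48))   # after processor 106 is served
    print(decide_next(0x00, 0x00))   # after the last requester is served
```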

The response/snoop control unit 134 then performs processing according to FIGS. 19 and 20. First, it receives Rsp_Data_Cnflt (S41) and determines whether the bit corresponding to processor 104 in the EBR conflict detection flag is 1 (S42). Because the EBR conflict detection flag is all 0 at this point, the bit corresponding to its own connected processor is not determined to be 1. The response/snoop control unit 134 therefore returns Rsp_Data_S to its connected processor (S48). In effect, the response/snoop control unit 134 converts the received Rsp_Data_Cnflt into Rsp_Data_S, which processor 104 can handle.

Next, the response/snoop control unit 134 resets to 0 the bit corresponding to its own connected processor in the SBR conflict detection flag of Rsp_Data_Cnflt (S49). Specifically, it resets to 0 the bit corresponding to processor 104 in the SBR conflict detection flag of Rsp_Data_Cnflt. Here the SBR conflict detection flag changes from 0x48 to 0x40.

Proceeding to FIG. 20, the response/snoop control unit 134 determines whether the EBR conflict detection flag is all 0 (S51). Here the EBR conflict detection flag is determined to be all 0, so the response/snoop control unit 134 determines whether the SBR conflict detection flag is all 0 (S52). Because the SBR conflict detection flag is 0x40, it is determined not to be all 0. Since bit 6 is the only bit set in the SBR conflict detection flag 0x40, the response/snoop control unit 134 selects processor 107 as the next reply target processor (S55).

The response/snoop control unit 134 then sets the ID denoting processor 107 in the destination ID of Rsp_Data_Cnflt (S57), extracts the value of request ID_6 of Rsp_Data_Cnflt, and sets it in the original request ID (S58). It then issues Rsp_Data_Cnflt and transmits it to the response/snoop control unit 137, to which processor 107, the selected next reply target processor, is connected (S59).

Like the response/snoop control unit 134, the response/snoop control unit 137 performs processing according to FIGS. 19 and 20. However, in step S49 the SBR conflict detection flag changes from 0x40 to 0x00 when the bit corresponding to processor 107 is reset, so in step S52 the SBR conflict detection flag is determined to be all 0 and the processing proceeds to step S53.

When it is determined in step S52 that the SBR conflict detection flag is all 0, the response/snoop control unit 137 generates Cmp_I and WB and returns them to the coherency control unit (S53). Specifically, as with 920 in FIG. 10, the response/snoop control unit 137 generates Cmp_I and WB and issues them to the coherency control unit 143.

After that, the coherency control unit 143 resets to 0 the Valid bit of the conflict information stored in the address conflict control unit 147. In response to the WB, the coherency control unit 143 also generates MEM_RQ_W, issues it to the main memory management system 151 via the memory interface 183, and issues Cmp (S54). This completes all processing.
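The whole walk-through, from the first Rsp_Data_Cnflt to the final Cmp_I and WB, can be replayed with a short simulation. It is a simplified model under stated assumptions: bit i stands for processor 101 + i, the lowest-bit-first policy of the example is used, and the function name is hypothetical. Message routing, data transfer, and timing are not modeled.

```python
from typing import List, Tuple


def service_order(ebr_flags: int, sbr_flags: int) -> List[Tuple[int, str]]:
    """Replay the hand-off chain and return (processor, response) pairs in
    the order in which the processors receive their data."""
    order: List[Tuple[int, str]] = []
    while ebr_flags or sbr_flags:
        if ebr_flags:
            # EBR requesters are served first, lowest bit first (S33, S56).
            bit = (ebr_flags & -ebr_flags).bit_length() - 1
            order.append((101 + bit, "Rsp_Data_E"))
            ebr_flags &= ~(1 << bit)          # S47
        else:
            # Then the SBR requesters, lowest bit first (S32, S55).
            bit = (sbr_flags & -sbr_flags).bit_length() - 1
            order.append((101 + bit, "Rsp_Data_S"))
            sbr_flags &= ~(1 << bit)          # S49
    # Both flags are now 0: the last unit returns Cmp_I and WB (S53).
    return order


if __name__ == "__main__":
    for processor, response in service_order(0x25, 0x48):
        print(processor, response)
```

With the example flags 0x25 and 0x48, the chain visits processors 101, 103, and 106 with Rsp_Data_E and then processors 104 and 107 with Rsp_Data_S, which matches the sequence described in the text.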

As described above, in the second embodiment of the present invention, the snoop request and its completion do not make a round trip between the target processor and the cache coherency control unit each time one processor request is processed, as they did in the problem described earlier. Instead, the next processor's request can be processed without a completion, which eliminates that round-trip latency. Address-conflict data moves directly between processors while cache coherency of the system as a whole remains guaranteed, which reduces the total time needed to process all requests when an address conflict occurs.

<Other embodiments of the invention>
Furthermore, the present invention is not limited to the embodiments described above, and various modifications can of course be made without departing from the gist of the present invention already described.

11 processor
12 processor
13 cache memory
14 cache memory
15 cache memory control system
16 transmission/reception control unit
17 coherency control unit
18 main memory
100 multiprocessor system
101 processor
102 processor
103 processor
104 processor
105 processor
106 processor
107 processor
108 processor
111 cache memory
112 cache memory
113 cache memory
114 cache memory
115 cache memory
116 cache memory
117 cache memory
118 cache memory
121 node controller
122 node controller
131 response/snoop control unit
132 response/snoop control unit
133 response/snoop control unit
134 response/snoop control unit
135 response/snoop control unit
136 response/snoop control unit
137 response/snoop control unit
138 response/snoop control unit
141 crossbar
142 crossbar
143 coherency control unit
144 coherency control unit
145 cache status management function
146 cache status management function
147 address conflict control unit
148 address conflict control unit
151 main memory management system
152 main memory management system
153 main memory
154 main memory
155 main memory controller
156 main memory controller
161 system interface
162 system interface
163 system interface
164 system interface
165 system interface
166 system interface
167 system interface
168 system interface
171 interface
172 interface
173 interface
174 interface
175 interface
176 interface
177 interface
178 interface
181 interface
182 interface
183 memory interface
184 memory interface
185 node interface
301 cache status
302 caching agent information
701 Valid bit
702 EBR conflict detection flag
703 EBR conflict detection flag
704 request ID

Claims

A cache memory control system comprising:
a transmission/reception control unit connected to a plurality of processors each having a cache memory; and
a coherency control unit that accesses a main memory and maintains consistency among the cache memories of the plurality of processors, wherein
the coherency control unit, when target addresses of a plurality of access requests from the plurality of processors to the main memory conflict, the access requests including at least an exclusive data read, transmits to the transmission/reception control unit a response instruction including contention information, which is information on the conflict among the plurality of access requests, and
the transmission/reception control unit, based on the contention information included in the response instruction, returns data corresponding to the access request to a reply target processor determined from among the processors whose access request is the exclusive data read, and subsequently transmits a snoop request requesting acquisition of the data in the cache memory of the reply target processor.

2. The cache memory control system according to claim 1, wherein the transmission/reception control unit determines, based on the contention information included in the response instruction, a processor to be the next reply target after the reply target processor, and, after receiving a response to the snoop request from the reply target processor, returns data corresponding to the access request to the next reply target processor.

The coherency control unit determines a first reply target processor based on the contention information, and transmits the first reply target processor to the transmission / reception control unit as a response instruction to the determined reply target processor. The cache memory control system described in 1.

The coherency control unit associates the processor that has made the plurality of access requests with the type of the access request and includes the same as the contention information in the response instruction. The cache memory control system according to item.

5. The cache memory control system according to claim 1, further comprising contention information storage means for storing the contention information, wherein
the coherency control unit refers to the contention information storage means, determines whether the target addresses of the plurality of access requests conflict, and, when it determines that they conflict, transmits the response instruction to the transmission/reception control unit.

6. The cache memory control system according to claim 5, wherein, upon receiving the plurality of access requests, the coherency control unit associates each processor that made one of the access requests with the type of that access request and stores the association in the contention information storage means as the contention information.

The cache memory control system according to claim 1, wherein the return target processor is determined based on a predetermined order in the plurality of processors.

8. The cache memory control system according to claim 1, wherein the contention information includes, for each address unit of the main memory, a contention presence/absence flag, the types of the access requests, and identification information of the access requests.

A cache memory control method in a multiprocessor system comprising a transmission/reception control unit connected to a plurality of processors each having a cache memory, and a coherency control unit that accesses a main memory and maintains consistency among the cache memories of the plurality of processors, the method comprising:
receiving, in the transmission/reception control unit, a plurality of access requests from the plurality of processors to the main memory, the access requests including at least an exclusive data read;
transmitting, in the coherency control unit, when target addresses of the plurality of access requests conflict, a response instruction including contention information, which is information on the conflict among the plurality of access requests, to the transmission/reception control unit;
returning, in the transmission/reception control unit, based on the contention information included in the response instruction, data corresponding to the access request to a reply target processor determined from among the processors that made the exclusive data read access request; and
subsequently transmitting, in the transmission/reception control unit, a snoop request requesting acquisition of the data in the cache memory of the reply target processor.

The control method according to claim 9, wherein the transmission/reception control unit determines, based on the contention information included in the response instruction, a processor to be the next reply target after the reply target processor, and, after receiving a response to the snoop request from the reply target processor, returns data corresponding to the access request to the next reply target processor.

The coherency control unit determines a first reply target processor based on the contention information, and transmits to the transmission / reception control unit as a response instruction to the determined reply target processor. The control method described in 1.

12. The control method according to any one of claims 9 to 11, wherein the coherency control unit associates each processor that made one of the plurality of access requests with the type of that access request and includes the association in the response instruction as the contention information.

The control method according to claim 9, wherein the multiprocessor system includes contention information storage means for storing the contention information, and
the coherency control unit refers to the contention information storage means to determine whether the target addresses of the plurality of access requests conflict and, when it determines that they conflict, transmits the response instruction to the transmission/reception control unit.

The control method according to claim 13, wherein, upon receiving the plurality of access requests, the coherency control unit associates each processor that made one of the access requests with the type of that access request and stores the association in the contention information storage means as the contention information.

The control method according to claim 9, wherein the return target processor is determined based on a predetermined order in the plurality of processors.

16. The control method according to claim 9, wherein the contention information includes, for each address unit of the main memory, a contention presence/absence flag, the types of the access requests, and identification information of the access requests.