EP3788494B1 - Transfer protocol in a data processing network
Transfer protocol in a data processing network
- Publication number
- EP3788494B1 (application EP19723174A)
- Authority
- EP
- European Patent Office
- Legal status
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L1/00—Arrangements for detecting or preventing errors in the information received
- H04L1/12—Arrangements for detecting or preventing errors in the information received by using return channel
- H04L1/16—Arrangements for detecting or preventing errors in the information received by using return channel in which the return channel carries supervisory signals, e.g. repetition request signals
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0817—Cache consistency protocols using directory methods
- G06F12/0828—Cache consistency protocols using directory methods with concurrent directory accessing, i.e. handling multiple concurrent coherency transactions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0831—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
- G06F12/0833—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means in combination with broadcast means (e.g. for invalidation or updating)
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0813—Multiuser, multiprocessor or multiprocessing cache systems with a network or matrix configuration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0831—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
- G06F2212/1024—Latency reduction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L1/00—Arrangements for detecting or preventing errors in the information received
- H04L2001/0092—Error control systems characterised by the topology of the transmission link
- H04L2001/0097—Relays
Description
- A multi-processor data processing system may be arranged as an on-chip network with nodes of various types, such as processors, accelerators, I/O and memory, connected via a coherent interconnect. At a high level, there are three basic node types: requestor, home and slave. A Request Node (RN) is a node that generates protocol transactions, including reads and writes, to the interconnect. These nodes could be fully coherent processors or I/O-coherent devices. A Home Node (HN) is a node that receives protocol transactions from RNs. Each address in the system has a Home which acts as the Point-of-Coherency (PoC) and Point of Serialization (PoS) for requests to that address. In a typical implementation, Homes for a range of addresses are grouped together as a Home Node. Each of these Home Nodes may include a system level cache and/or a snoop filter to reduce redundant snoops.
- A Slave Node (SN) is a node that receives and completes requests from the HNs. An SN may be used for peripheral or main memory.
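- To make these roles concrete, the following minimal sketch (not part of the patent itself) models the three node types and a system address map that assigns each address a Home. The modulo interleaving and all names are illustrative assumptions.

```python
from enum import Enum, auto

class NodeType(Enum):
    REQUEST_NODE = auto()  # RN: generates reads and writes
    HOME_NODE = auto()     # HN: Point-of-Coherency and Point of Serialization
    SLAVE_NODE = auto()    # SN: completes requests, e.g. a memory controller

def home_for_address(address: int, num_home_nodes: int) -> int:
    # A real system consults a system address map; modulo interleaving
    # stands in for it here.
    return address % num_home_nodes

# With four Home Nodes, address 0x1003 is served by Home Node 3.
assert home_for_address(0x1003, 4) == 3
```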
- Data from a shared data resource may be accessed by a number of different processors and copies of the data may be stored in local caches for rapid access. A cache coherence protocol may be used to ensure that all copies are up to date. The protocol may involve the HN performing a coherency action that may include exchanging snoop messages with the RNs having copies of data being accessed.
- The HN may serialize accesses to an address on a first-come, first-served basis. For example, access to a designated device and resources of the HN may be reserved until a current transaction has been completed. A disadvantage of this approach is that HN resources may be reserved for longer than necessary, which may adversely affect system performance.
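- The serialization just described can be pictured with a short, hypothetical model: one transaction per address is active at a time, and later requesters queue behind it. It also shows why a late completion holds HN resources longer than necessary.

```python
from collections import defaultdict, deque

class SerializingHome:
    """Toy first-come, first-served serialization point (illustrative only)."""
    def __init__(self):
        self.active = {}                   # address -> requester holding it
        self.waiting = defaultdict(deque)  # address -> queued requesters

    def request(self, address, requester):
        if address in self.active:
            self.waiting[address].append(requester)  # blocked until completion
        else:
            self.active[address] = requester         # HN resources reserved

    def complete(self, address):
        # The later this runs, the longer queued requesters must wait.
        if self.waiting[address]:
            self.active[address] = self.waiting[address].popleft()
        else:
            del self.active[address]

hn = SerializingHome()
hn.request(0x80, "RN1")
hn.request(0x80, "RN2")   # queued behind RN1
hn.complete(0x80)         # RN1 finishes; RN2 becomes active
assert hn.active[0x80] == "RN2"
```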
- EP 0 818 732 A2 discloses a multi-node architecture of processing units that caches copied data and implements a cache coherency protocol, wherein memory blocks are distributed among the nodes of a network including a request node, a home node and a slave node, and that implements a hybrid cache protocol including a directory cache protocol and a directory-less protocol. US 2013/042077 A1 discloses memory blocks distributed among the nodes of a network including a request node, a home node and a memory controller, wherein an initiator request node is configured to respond to receipt of a snoop request. US 2012/079211 A1 discloses a plurality of data beats of the requested data, where a first data beat of the plurality of data beats is received at a first time and a last data beat of the plurality of data beats is received at a second time, subsequent to the first time.
- The accompanying drawings provide visual representations which will be used to more fully describe various representative embodiments and can be used by those skilled in the art to better understand the representative embodiments disclosed and their inherent advantages. In these drawings, like reference numerals identify corresponding elements.
- FIG. 1 is a block diagram of a data processing network, in accordance with various representative embodiments.
- FIG. 2 is a transaction flow diagram for a conventional data access in a data processing network.
- FIGs 3-4 are transaction flow diagrams for data access, in accordance with various representative embodiments.
- FIG. 5 is a flow chart of a method of operation of a Home Node of a data processing network, in accordance with various representative embodiments.
- FIG. 6 is a flow chart of a method of operation of a Request Node of a data processing network, in accordance with various representative embodiments.
- FIG. 7 is a flow chart of a further method of operation of a Request Node of a data processing network, in accordance with various representative embodiments.
- FIG. 8 is a flow chart of a still further method of operation of a Request Node of a data processing network, in accordance with various representative embodiments.
- The various apparatus and devices described herein provide mechanisms for automatic routing and allocation of incoming data in a data processing system.
- While this present disclosure is susceptible of embodiment in many different forms, there is shown in the drawings and will herein be described in detail specific embodiments, with the understanding that the present disclosure is to be considered as an example of the principles of the present disclosure and not intended to limit the present disclosure to the specific embodiments shown and described. In the description below, like reference numerals are used to describe the same, similar or corresponding parts in the several views of the drawings.
- In this document, relational terms such as first and second, top and bottom, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. An element preceded by "comprises ... a" does not, without more constraints, preclude the existence of additional identical elements in the process, method, article, or apparatus that comprises the element.
- Reference throughout this document to "one embodiment", "certain embodiments", "an embodiment" or similar terms means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, the appearances of such phrases in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments without limitation.
- The term "or" as used herein is to be interpreted as an inclusive or meaning any one or any combination. Therefore, "A, B or C" means "any of the following: A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition will occur only when a combination of elements, functions, steps or acts are in some way inherently mutually exclusive.
- For simplicity and clarity of illustration, reference numerals may be repeated among the figures to indicate corresponding or analogous elements. Numerous details are set forth to provide an understanding of the embodiments described herein. The embodiments may be practiced without these details. In other instances, well-known methods, procedures, and components have not been described in detail to avoid obscuring the embodiments described. The description is not to be considered as limited to the scope of the embodiments described herein.
- The present disclosure relates to a mechanism in a data processing network for speeding up a data fetch operation. The disclosed mechanism, in addition to reducing the lifetime of a data fetch transaction in the interconnect, also reduces the number of resources required to fully utilize the interconnect components. In addition, the disclosed mechanism improves the throughput of transactions in the interconnect by chaining request-response pairs from different sources.
- FIG. 1 is a block diagram of a data processing system 100, in accordance with various representative embodiments. A number of processing core clusters 102 (referred to as Request Nodes (RNs)) are coupled to data resources via coherent interconnect 104. Data is received via input/output (I/O) requesting nodes (RN-I). In the example shown, RN-I 106a comprises a network interface controller (NIC) that receives data from network 108, and RN-I 106b receives data from I/O device 112. I/O device 112 may be coupled via a peripheral component interconnect express (PCIe) bus, direct memory access (DMA) unit, or network accelerator, for example. Data may be stored in one or more memory or storage devices 114 that are coupled to coherent interconnect 104 via one or more memory controllers 116. Home Nodes (HN) 118 and 120 may include system level caches. Each Home Node (HN) serves as a point of serialization and/or point of coherence for data stored at a given set of system addresses. A Home Node (HN-F), such as 118, may be a home for local resources. Alternatively, a Home Node (HN-I), such as 120, may provide an interface to off-chip resources or on-chip peripheral devices. Data requested by a Request Node 102 may be retrieved from a system level cache of the HN, from another Request Node, or from a memory 114 via a memory controller 116. The memory controllers are examples of Slave Nodes (SNs).
- To avoid conflicts when multiple RNs try to access the same memory location, the Home Nodes 118 act as points of serialization, processing read requests and other transactions in a serial manner, such as first-come, first-served. Coherent interconnect 104 is used to transfer data over data (DAT) channels between nodes. In addition, a messaging protocol is used to control each access transaction, in which requests and responses are sent over REQ and RSP channels in the interconnect. Finally, 'snoop' messages are sent over SNP channels in the interconnect to ensure data coherence.
- An aspect of the present disclosure relates to an improved messaging and data transfer mechanism, implemented in the hardware of the nodes, that provides improved performance and efficiency of the data processing network.
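- The channel structure described above can be summarized in a small sketch. The channel names follow the text; the message class and its fields are illustrative assumptions.

```python
from dataclasses import dataclass
from enum import Enum

class Channel(Enum):
    REQ = "request"   # read and write requests
    RSP = "response"  # non-data responses, such as CompAck
    SNP = "snoop"     # coherence snoop messages
    DAT = "data"      # data beats

@dataclass
class Message:
    channel: Channel
    kind: str         # e.g. "ReadShared", "CompAck", "SnpOnce", "CompData0"
    src: str
    dst: str
    address: int

read = Message(Channel.REQ, "ReadShared", src="RN1", dst="HN", address=0x80)
ack = Message(Channel.RSP, "CompAck", src="RN1", dst="HN", address=0x80)
```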
- In one embodiment, a data processing network comprises one or more request nodes (102) configured to access a shared data resource (e.g. 114), a Home Node (e.g. 118) that provides a point of coherency for data of the shared data resource, and a coherent interconnect (104) configured to couple between the one or more request nodes and the Home Node. To read data at a first address in the shared data resource, a request node sends a request to the Home Node. For example, data may be transferred in blocks having the size of one cache line. When the DAT bus in the coherent interconnect has a width smaller than a cache line, the requested data is sent through the interconnect on the DAT channel as a plurality of data beats. These data beats may take different paths through the interconnect and are not guaranteed to arrive in the order in which they were sent. For example, a first data beat of the plurality of data beats may be received at a first time and a last data beat received at a second time subsequent to the first time, but these may not correspond to the first and last beats passed to the interconnect.
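- The beat mechanics can be made concrete with a short sketch: a 64-byte line on a 16-byte DAT bus takes four beats, and a bitmask lets the receiver detect the first and last arrivals even when the beats arrive out of order. The sizes are assumptions chosen for illustration.

```python
CACHE_LINE_BYTES = 64
DAT_BUS_BYTES = 16
BEATS_PER_LINE = CACHE_LINE_BYTES // DAT_BUS_BYTES  # four beats per line

class BeatTracker:
    def __init__(self, beats: int = BEATS_PER_LINE):
        self.full_mask = (1 << beats) - 1   # one bit per expected beat
        self.received = 0

    def receive(self, beat_id: int) -> tuple[bool, bool]:
        """Record one beat; return (is_first_to_arrive, is_last_to_arrive)."""
        first = self.received == 0
        self.received |= 1 << beat_id
        return first, self.received == self.full_mask

t = BeatTracker()
assert t.receive(2) == (True, False)   # beat 2 arrives first: out of order
t.receive(0)
t.receive(3)
assert t.receive(1) == (False, True)   # all four beats now present
```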
- In accordance with an aspect of the present disclosure, the request node sends an acknowledgement message to the Home Node in response to receiving the first data beat. In contrast, prior systems delay sending an acknowledgement until the last data beat has been received.
- Subsequent to sending the acknowledgement message to the Home Node, the request node accepts snoop messages from the Home Node. The request node is configured to track when all requested data beats have arrived and treats snoops arriving in the interim in a different manner to other snoops. In contrast, in prior systems, the Request Nodes were configured to not send an acknowledgement until all data beats were received. This prevents Home Nodes from sending snoops during this period. Consequently, resources of the Home Node are utilized for a longer period of time.
- The Home Node receives, at a third time, the request to read data at the first address and performs a coherence action for the data at the first address dependent upon a presence of copies of the requested data at various locations in the data processing network. The Home Node then causes the requested data to be transmitted to the request node in the plurality of data beats. At a fourth time, the acknowledgement message is received from the request node. In the time period between the third time and the fourth time, the Home Node does not send any snoop request to the Request Node for data at the first address. However, subsequent to the fourth time, the Home Node allows snoop requests for data at the first address to be sent to the Request Node.
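- The Home Node's behavior in the window between the third and fourth times can be sketched as follows; the tracker class and its method names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Snoop:
    address: int
    target: str

class HomeTracker:
    """Tracks one outstanding read; holds snoops until CompAck arrives."""
    def __init__(self, address: int, requester: str):
        self.address = address
        self.requester = requester
        self.ack_seen = False       # becomes True at the "fourth time"
        self.held = []

    def route_snoop(self, snoop: Snoop):
        # Between the third and fourth times, snoops for this address must
        # not be sent to the requester; hold them instead.
        if not self.ack_seen and snoop.address == self.address:
            self.held.append(snoop)
            return None
        return snoop                # window closed: deliver normally

    def on_compack(self):
        self.ack_seen = True
        held, self.held = self.held, []
        return held                 # now safe to send to the requester

tracker = HomeTracker(0x80, "RN1")
assert tracker.route_snoop(Snoop(0x80, "RN1")) is None  # held in the window
assert len(tracker.on_compack()) == 1                   # released after the ack
```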
- In one embodiment, the Request Node may buffer a snoop request from the Home Node for the data at the first address when the snoop request is received in the time period between the first time and the second time. Data may be sent in response to the snoop request after the last data beat of the plurality of data beats has been received.
- In a further embodiment, when a snoop request from the Home Node for the data at the first address is received in the time period between the first time and the second time, the Request Node forwards data beats of the requested data as they are received.
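- The buffering and forwarding embodiments can be contrasted in a sketch. Both classes, and the callbacks they are given, are illustrative stand-ins for interconnect interfaces rather than a real API.

```python
class BufferingRN:
    """Holds snoops that arrive between the first and last beat."""
    def __init__(self, send_response):
        self.send_response = send_response
        self.pending = []
        self.all_beats_in = False

    def on_snoop(self, snoop):
        if self.all_beats_in:
            self.send_response(snoop)   # normal handling
        else:
            self.pending.append(snoop)  # defer until the last beat arrives

    def on_last_beat(self):
        self.all_beats_in = True
        for snoop in self.pending:
            self.send_response(snoop)
        self.pending.clear()

class ForwardingRN:
    """Forwards beats to the snoop target as they arrive, without waiting."""
    def __init__(self, send_beat):
        self.send_beat = send_beat
        self.target = None

    def on_snoop(self, snoop_target):
        self.target = snoop_target      # forward all beats from now on

    def on_beat(self, beat):
        if self.target is not None:
            self.send_beat(self.target, beat)

responses = []
rn = BufferingRN(responses.append)
rn.on_snoop("SnpShared@0x80")
assert responses == []                  # buffered, not yet answered
rn.on_last_beat()
assert responses == ["SnpShared@0x80"]  # answered once all beats are in
```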
- The Home Node is configured to allocate resources of the Home Node when the read request is received, and to free the resources when the acknowledgement message, acknowledging receipt of a first beat of the plurality of data beats, is received from the Request Node.
- These mechanisms enable Home Node resources to be allocated for a shorter period of time, freeing the resources for other transactions. They also enable data to be shared between multiple Request Nodes with less latency.
- FIG. 2 is a transaction flow diagram for a conventional data access in a data processing network. In FIG. 2, vertical bars show time lines for a Request Node (RN) and a Home Node (HN), respectively, with time flowing from top to bottom. The RN issues Read request 206 to the Home Node for the read address. Assuming the requested data is in the cache of the HN, the HN sends the requested data 208 (in four data beats, CompData0, CompData1, CompData2 and CompData3, in the example shown) to the RN. The 'CompData' message contains the data and an indication that the transaction is complete with respect to that data. That is, the RN can consider the read transaction to be globally observed, as there is no action which can modify the read data received.
- All communications are transmitted via the coherent interconnect. The data may be transmitted to the RN in multiple data beats across the interconnect. Once all of the requested data has been received by the RN, a completion acknowledgment (CompAck) message 210 is sent from the RN to the HN. Thus, the duration of the transaction is T1-T2 for the RN and T3-T4 for the HN. During the period T3-T4, the HN assigns resources for the transaction (such as a tracker). In addition, the HN refrains from sending snoop messages to the RN for the addresses accessed in the Read transaction. Otherwise, for example, a snoop request might arrive at the RN prior to the arrival of the data from the HN.
- The HN must wait for CompAck response 210 before sending a snoop, since the data and the snoop may take different paths through the interconnect, whereby the snoop could arrive at the RN before some of the data.
-
FIGs 3-4 are transaction flow diagrams of a mechanism for data access in a data processing network, in accordance with various representative embodiments. The figures and the associated discussion below, describe the transaction structure and the dependencies that exist within an example transaction. The figures show the dependencies for a transaction with a separate Data and Home response. -
- FIG. 3 is a transaction flow diagram for data access in a data processing network, in accordance with various representative embodiments. Analogous to FIG. 2, in FIG. 3, vertical bars 302, 304 and 306 show time lines for a Request Node (RN), Home Node (HN) and Slave Node (SN), respectively, with time flowing from top to bottom. The RN issues Read request 308 to the Home Node for the read address. If the requested data is not in the cache of the HN, ReadNoSnp request 310 is sent to the appropriate SN (such as a memory controller, for example). The SN sends the requested data to the RN (in four data beats 312, 314, 316, 318). It is noted that the beats may arrive out of order, having taken different routes through the interconnect. In the mechanism shown in FIG. 3, the RN sends CompAck 320 to the HN at time T2, when the first data has arrived at the RN. When the HN receives CompAck 320, it releases the resources allocated to the Read transaction. The resources of the HN are in use for the time period T5-T6. In contrast, in the mechanism shown in FIG. 2, the CompAck 322 would have been sent at time T3. The resources of the HN would then have been occupied for the time period T5-T7, which is considerably longer than the period T5-T6 for the disclosed mechanism.
- As before, if the requested data is present in the cache of the HN, the data is sent directly from the HN to the RN and no messages are exchanged with the SN.
- The mechanism shown in
FIG. 3 reduces the time period over which resources of the HN are allocated to the Read transaction, thereby allowing for an increase in overall system performance. -
- FIG. 4 is a further transaction flow diagram for data access in a data processing network, in accordance with various representative embodiments. In FIG. 4, vertical bars show time lines for three Request Nodes (RN1, RN2 and RN3), a Home Node (HN) and a Slave Node (SN), respectively, with time flowing from top to bottom. The HN receives ReadShared request 412 from RN1 for a read address. Subsequently, the HN receives requests, denoted as ReadShared2 and ReadShared3, from RN2 and RN3, respectively. The HN acts as a Point of Serialization (PoS) and processes these requests in the order in which they are received. The HN also acts as Point of Coherency (PoC) and performs a coherency action when request 412 is received. In the example shown, the HN sends ReadNoSnp message 414 to the Slave Node. In response, the SN sends the requested data in four beats, denoted as CompData0 (416), CompData1, CompData2 and CompData3 (418), to the requesting node RN1. When the first data beat (CompData0, 416) is received at RN1, RN1 sends a CompAck 420 message to the HN, as discussed above with reference to FIG. 3. The HN receives the CompAck message 420 at time T2 and is then able to free resources allocated to the Read transaction. In particular, the HN is permitted to send snoop requests, such as Snoop2, to RN1.
- When RN2 and RN3 have requested the same shared data as RN1, the data is forwarded from RN1. When CompData0 arrives at RN2, RN2 sends a CompAck message 422, denoted as CompAck2, to the HN. The HN receives the CompAck2 message 422 at time T3 and is then permitted to send snoop requests, such as Snoop3, to RN2. Similarly, when CompData0 arrives at RN3, RN3 sends a CompAck message 424, denoted as CompAck3, to the HN. The HN receives the CompAck3 message 424 at time T4 and is then permitted to send snoop requests to RN3. In this way, resources of the HN are only allocated for the time period T1-T2.
- When the snoop is non-invalidating and not a SnpOnce_message, the line may be used and cached in shared state at the Request Node. The cached copy must not be modified by the Request Node.
- When the snoop is a SnpOnce_message, the line can be used and may be cached in any state and modified at the Request Node.
- When the Snoop is an invalidating snoop, the received data can be used only once and dropped and must not be cached.
- In all the above cases, when the Request Node's request was a result of a store from the core then the received data can be modified but the modified data must be forwarded to the snoop response, and modified data can be cached but must not be modified if the snoop is non-invalidating. The modified data should not be cached if the snoop is invalidating type.
- The disclosed mechanism allows the Home Node to release resources with hazards and other transaction resources early, enabling the Home Node resources to be optimally utilized with minimum overhead. Also, this scales well with system size since the interconnect size, and thus increased data packet traversal latencies, do not require Home Node resources to be increased proportionately.
-
- FIG. 5 is a flow chart of a method of operation 500 of a Home Node in a data processing network, in accordance with representative embodiments. Following start block 502, a read request is received by the HN at block 504. At block 506, resources are reserved within the HN and snoops to the requested read address(es) are blocked. If the requested data is not present in the system cache of the HN, as depicted by the negative branch from decision block 508, a request (ReadNoSnp, for example) for the data is sent to the appropriate Slave Node at block 510. Otherwise, as depicted by the positive branch from decision block 508, transfer of the data from the system cache begins at block 512. The HN waits at decision block 514 until a CompAck message is received from the RN, indicating that the first data has arrived. When the CompAck message is received, as depicted by the positive branch from decision block 514, the HN releases its allocated resources and enables the sending of snoops to the RN at block 516. The HN participation in the transaction is then complete, as indicated by termination block 518.
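- Rendered as straight-line code, the FIG. 5 flow looks roughly as follows. The hn object and every helper method are assumed stubs standing in for interconnect machinery, not a real API.

```python
def home_node_read(hn, request):
    hn.reserve_resources(request.address)     # block 506: reserve, block snoops
    if hn.cache_has(request.address):         # decision block 508
        hn.start_cache_transfer(request)      # block 512: send from system cache
    else:
        hn.send_read_no_snp(request.address)  # block 510: fetch via a Slave Node
    hn.wait_for_compack(request)              # decision block 514: first beat acked
    hn.release_resources(request.address)     # block 516: snoops to the RN enabled
    # termination block 518: HN participation in the transaction is complete
```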
- FIG. 5 shows an embodiment of a method of data transfer in a data processing network. A Home Node receives, at a first time, a request to read data at a first address in the network, where the request has been sent via the coherent interconnect from a Request Node of the data processing network. The Home Node performs a coherence action for the data at the first address dependent upon a presence of copies of the requested data in the data processing network. This may involve sending snoop messages to devices of the network having copies of the requested data. The coherence state of the data copies may be changed and/or the data may be written back to a memory, for example. The Home Node then causes the requested data to be transmitted to the Request Node in a plurality of data beats. For example, the data may be transmitted from the Home Node when it is present in a system cache of the Home Node. Alternatively, the data may be transmitted from another Request Node having a copy of the data or from a Slave Node, such as a memory management unit. When the first data beat is received by the Request Node, it sends an acknowledgement message to the Home Node. In a time period between the first time and the second time, the Home Node may receive data requests from other Request Nodes for data at the first address. However, the Home Node does not send any snoop request to the Request Node for the data during this time period.
- When the Home Node receives the read request from the Request Node it allocates resources of the Home Node to enable performance of the coherency action and control of snoop messages. Once the acknowledgement message is received from the Request Node, acknowledging receipt of a first beat of the plurality of data beats, the resources of the Home Node are freed.
- In contrast, in prior systems, the Request Node does not acknowledge receipt of the requested data until all data beats have been received. As a result, Home Node resources are used for a longer time period.
- When a read request is received by the Home Node, the Home Node determines one or more locations where copies of the requested data are stored in the data processing network. This information may be stored in a presence vector of an entry in a snoop filter, for example. When the requested data is stored in a cache of the Home Node, the plurality of data beats are transferred from the Home Node to the Request Node via the coherent interconnect. When the requested data is stored at a different network node (such as Slave Node or another Request Node), the Home Node sends a request for the data beats to be sent from that node to the Request Node.
-
- FIG. 6 is a flow chart of a method of operation 600 of a Request Node in a data processing network, in accordance with representative embodiments. Following start block 602, a read request is sent to a HN at block 604. The RN then waits at decision block 606 until the first data is received in response to the read request. The data may be received from the HN or directly from an SN. When the first data is received, as depicted by the positive branch from decision block 606, a CompAck message is sent to the HN at block 608. Subsequently, when a snoop request is received, as depicted by the positive branch from decision block 610, the snoop is buffered at block 612 and no response is made. When all data associated with the read request has been received, as depicted by the positive branch from decision block 614, the RN responds to any buffered snoop messages at block 616 and the method terminates at block 618. In this manner, the CompAck message is sent before all data has been received, thereby freeing the HN sooner.
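- The FIG. 6 flow can likewise be written out in straight-line form. The rn object and its helper methods are assumed stubs rather than a real API.

```python
def request_node_read(rn, address):
    rn.send_read(address)                 # block 604
    rn.wait_for_first_beat()              # decision block 606
    rn.send_compack()                     # block 608: acknowledgement sent early
    while not rn.all_beats_received():    # decision block 614
        snoop = rn.poll_snoop()
        if snoop is not None:             # decision block 610
            rn.buffer_snoop(snoop)        # block 612: no response yet
    rn.respond_to_buffered_snoops()       # block 616
```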
- In an alternative embodiment, as shown in FIG. 7, the RN may provide data in response to a snoop before all beats of data have been received for its own request. In this case, the received data is treated with certain constraints. These constraints are dependent on the type of snoop.
- FIG. 7 is a flow chart of a method 700 for responding to snoops in a Request Node, in accordance with embodiments of the disclosure. Following start block 702 in FIG. 7, a snoop is received by the RN at block 704, following a request for data by the RN. If all beats of the requested data have been received, as depicted by the positive branch from decision block 706, the snoop is responded to in the usual manner at block 708. Otherwise, as depicted by the negative branch from decision block 706, the RN applies certain constraints to the data. If the snoop is not an invalidating snoop or a SnpOnce request, as depicted by the negative branch from decision block 710, the RN may use the requested data. The data may be cached as shared, and the cached data cannot be modified by the RN, as shown in block 712. When the snoop is a SnpOnce message, as depicted by the positive branch from decision block 714, the data can be used and may be cached in any state and modified at the Request Node, as shown in block 716. When the snoop is an invalidating snoop, as depicted by the positive branch from decision block 718, the received data can be used only once, then dropped, and must not be cached, as shown in block 720; otherwise, the method terminates at block 722. In all the above cases, when the Request Node's request was the result of a store from the core, the received data can be modified, but the modified data must be forwarded in the snoop response; the modified data can be cached but must not be further modified if the snoop is non-invalidating, and must not be cached if the snoop is of an invalidating type.
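- The FIG. 7 decision logic reduces to a small function over the snoop type. The strings returned here are descriptive labels for illustration only.

```python
def partial_data_constraints(snoop_kind: str) -> str:
    """Constraints on data received before the RN's own read completes."""
    if snoop_kind == "invalidating":
        return "use once, drop, do not cache"             # block 720
    if snoop_kind == "SnpOnce":
        return "use, cache in any state, may modify"      # block 716
    return "use, cache in shared state, do not modify"    # block 712

assert partial_data_constraints("SnpOnce") == "use, cache in any state, may modify"
assert partial_data_constraints("invalidating").startswith("use once")
```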
- FIG. 6 and FIG. 7 show embodiments of a method of data transfer in a data processing network, consistent with the present disclosure. In accordance with embodiments of the disclosure, a Request Node of the data processing network sends a request to read data at a first address in the network. The request is sent via a coherent interconnect to a Home Node of the data processing network. A system address map may be used to determine which Home Node the request should be sent to. In response to the request, the Request Node receives a plurality of data beats of the requested data via the coherent interconnect, where a first data beat of the plurality of data beats is received at a first time and a last data beat of the plurality of data beats is received at a second time subsequent to the first time. Responsive to receiving the first data beat, the Request Node sends an acknowledgement message to the Home Node via the coherent interconnect, and, subsequent to sending the acknowledgement message to the Home Node, the Request Node accepts snoop messages from the Home Node.
- After the acknowledge message has been received by the Home Node, the home is free to send snoop messages for the first address (or any other address) to the Request Node. In one embodiment, the time period between the first time and the second time, the Request Node buffers any snoop requests from the Home Node for the data at the first address. The Request Node processes these snoop messages after the last data beat of the plurality of data beats has been received by the Request Node. In a further embodiment, when (during the time period between the first time and the second time) a snoop request is received from the Home Node for the data at the first address, the Request Node forwards data beats of the requested data as they are received by the Request Node. In this embodiment, the forwarded data arrives at its target destination sooner than it would have done if the Request Node or home had waited for all data beats to be received before servicing another request.
- In accordance with certain embodiments, a snoop request for data at the first address is received by the Request Node during the time period between the first time and the second time, the received data handled in various ways by the Request Node, as shown in
FIG.7 , for example. When the snoop request is neither a 'SnpOnce' request nor an 'invalidating' request, the Request Node is configured to use, modify and cache the received data. When the snoop request is a ' SnpOnce' request, the Request Node is configured to use the received data, and cache the received data in a 'shared' state, but not modify the data. - When the snoop request is an 'invalidating' request, the Request Node is configured to use, but not cache, the data.
-
- FIG. 8 is a flow chart of a method of operation 800 of a Request Node of a data processing network, in accordance with various representative embodiments. The method corresponds to an operation of Request Node RN1 in the transaction flow diagram shown in FIG. 4, for example.
- Following start block 802 in FIG. 8, a read request is sent to a HN at block 804. The RN then waits at decision block 806 until the first data is received in response to the read request. The data may be received from the HN, or directly from an SN or another RN. When the first data beat is received, as depicted by the positive branch from decision block 806, a CompAck message is sent to the HN at block 808. Subsequently, when a new snoop request is received (for the data requested at block 804), as depicted by the positive branch from decision block 810, the received data beat (or beats) are forwarded to the target node indicated in the snoop at block 812. Subsequently, when additional data beats are received by the RN, the data beats are forwarded to the target node, as depicted by block 814. When all data beats associated with the read request have been received, as depicted by the positive branch from decision block 816, the RN forwards any remaining data beats to the snoop target(s) at block 818 and the method terminates at block 820. In this manner, the CompAck message is sent before all data has been received, thereby freeing the HN sooner.
- Those skilled in the art will recognize that the present disclosure has been described in terms of exemplary embodiments. The present disclosure could be implemented using hardware components, such as special-purpose hardware and/or dedicated processors. Similarly, dedicated processors and/or dedicated hard-wired logic may be used to construct alternative embodiments of the present disclosure.
- Dedicated or reconfigurable hardware components used to implement the disclosed mechanisms may be described by instructions of a Hardware Description Language or by a netlist of components and connectivity. The instructions or the netlist may be stored on a non-transient computer-readable medium such as Electrically Erasable Programmable Read Only Memory (EEPROM); non-volatile memory (NVM); mass storage such as a hard disc drive, floppy disc drive or optical disc drive; optical storage elements, magnetic storage elements, magneto-optical storage elements, flash memory, core memory and/or other equivalent storage technologies, without departing from the present disclosure.
- Various embodiments described herein are implemented using dedicated hardware, configurable hardware or programmed processors executing programming instructions that are broadly described in flow chart form and that can be stored on any suitable electronic storage medium or transmitted over any suitable electronic communication medium. A combination of these elements may be used. Those skilled in the art will appreciate that the processes and mechanisms described above can be implemented in any number of variations without departing from the present disclosure. For example, the order of certain operations carried out can often be varied, additional operations can be added, or operations can be deleted, without departing from the present disclosure. Such variations are contemplated.
- The various representative embodiments, which have been described in detail herein, have been presented by way of example and not by way of limitation. It will be understood by those skilled in the art that various changes may be made in the form and details of the described embodiments, resulting in embodiments that remain within the scope of the present disclosure. The invention is set out in the appended independent claims. Advantageous embodiments are defined by the appended dependent claims.
Claims (14)
- A method of data transfer in a data processing network, the method comprising: sending, by a Request Node (102) of the data processing network, a request to read data at a first address in the network, the request sent via a coherent interconnect to a Home Node (118) of the data processing network that is associated with the first address; receiving, by the Request Node via the coherent interconnect, a plurality of data beats of the requested data, where a first data beat of the plurality of data beats is received at a first time and a last data beat of the plurality of data beats is received at a second time subsequent to the first time; responsive to receiving the first data beat and prior to receiving the last data beat, the Request Node sending an acknowledgement message to the Home Node via the coherent interconnect; and, subsequent to sending the acknowledgement message to the Home Node and prior to the second time, the Request Node accepting snoop requests for the first address from the Home Node.
- The method of claim 1, further comprising:
buffering, by the Request Node, a snoop request for data at the first address received from the Home Node, when the snoop request is received in the time period between the first time and the second time.
- The method of claim 2, further comprising:
the Request Node sending data in response to the snoop request after the last data beat of the plurality of data beats has been received by the Request Node.
- The method of claim 1, further comprising, responsive to a snoop request for data at the first address received by the Request Node in a time period between the first time and the second time:
forwarding, by the Request Node, data beats of the requested data received by the Request Node.
- The method of any preceding claim, where the plurality of data beats are received by the Request Node via the coherent interconnect from a Slave Node of the data processing system, from a further Request Node of the data processing system, or from the Home Node.
- The method of claim 5, further comprising:
requesting, by the Home Node, the data to be sent from the Slave Node to the Request Node via the coherent interconnect, where the plurality of data beats are sent from the further Request Node responsive to a snoop request received at the further Request Node from the Home Node.
- The method of claim 1, further comprising, responsive to a snoop request for data at the first address received by the Request Node in a time period between the first time and the second time:
when the snoop request is neither a 'snoop once' request nor an 'invalidating' request, configuring the Request Node to use, modify and cache the received data;
when the snoop request is a 'snoop once' request, configuring the Request Node to use the received data, and cache the received data in a 'shared' state, but not modify the data; and
when the snoop request is an 'invalidating' request, configuring the Request Node to use but not cache the data.
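As a rough illustration of the three cases distinguished in claim 7 above, the sketch below maps each snoop type to the permissions the Request Node is given for the received data. The SnoopKind enum, the returned permission dictionary and the 'unique' state name are assumptions chosen for readability; the claim itself only fixes the use/modify/cache behaviour for each kind of snoop.

```python
# A rough mapping of the three cases of claim 7. The enum, the returned
# dictionary and the 'unique' state name are assumptions for readability.

from enum import Enum, auto


class SnoopKind(Enum):
    SNOOP_ONCE = auto()
    INVALIDATING = auto()
    OTHER = auto()


def permissions_after_snoop(kind: SnoopKind) -> dict:
    """Permissions the Request Node gets for data received during a snoop."""
    if kind is SnoopKind.SNOOP_ONCE:
        # Use and cache in a 'shared' state, but do not modify.
        return {"use": True, "modify": False, "cache": True, "state": "shared"}
    if kind is SnoopKind.INVALIDATING:
        # Use the data, but do not keep a cached copy.
        return {"use": True, "modify": False, "cache": False, "state": None}
    # Neither 'snoop once' nor 'invalidating': use, modify and cache.
    return {"use": True, "modify": True, "cache": True, "state": "unique"}
```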
- A method of data transfer in a data processing network, the method comprising:
receiving at a first time, by a Home Node (118) of the data processing network, a request to read data at a first address in the network, the request sent via a coherent interconnect from a Request Node (102) of the data processing network, where the Home Node is associated with the first address;
performing, by the Home Node, a coherence action for the data at the first address dependent upon a presence of copies of the requested data in the data processing network;
causing, by the Home Node, the requested data to be transmitted to the Request Node in a plurality of data beats;
receiving at a second time, by the Home Node, an acknowledgement message from the Request Node acknowledging receipt of a first beat of the plurality of data beats, the second time prior to receipt of a last data beat of the plurality of data beats by the Request Node;
in a time period between the first time and the second time, the Home Node not sending any snoop request to the Request Node for data at the first address; and
subsequent to the second time, the Home Node allowing snoop requests for data at the first address to be sent to the Request Node.
- The method of claim 8, further comprising:
allocating, by the Home Node responsive to receiving the request, resources of the Home Node to the read request from the Request Node; and
freeing, by the Home Node, the resources of the Home Node responsive to receiving the acknowledgement message from the Request Node acknowledging receipt of the first beat of the plurality of data beats.
- The method of claim 8 or claim 9, further comprising:
determining, by the Home Node, one or more locations of the requested data in the data processing network;
transferring, by the Home Node, the plurality of data beats to the Request Node via the coherent interconnect when the requested data is stored in a cache of the Home Node; and
sending, by the Home Node, a request to a further node of the data processing network when the requested data is stored at the further node, where the further node comprises a further Request Node having a copy of the requested data or a Slave Node of the data processing network.
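Claims 8 to 10 describe the Home Node side of the same exchange. The following sketch is a loose, hypothetical model of that behaviour: the transaction table, the deferred-snoop list and all method names are invented for this example, and both the coherence action and the choice of data source (HN cache, Slave Node or a further Request Node) are elided to comments.

```python
# A loose, hypothetical model of the Home Node (HN) behaviour of claims
# 8 to 10. The transaction table, deferred-snoop list and method names
# are invented; the coherence action itself is elided to a comment.

from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class HomeNode:
    in_flight: Dict[int, str] = field(default_factory=dict)   # addr -> requester
    deferred_snoops: List[Tuple[int, str]] = field(default_factory=list)

    def on_read_request(self, addr: int, requester: str) -> None:
        """First time: allocate resources and start the coherence action."""
        self.in_flight[addr] = requester
        # ...perform the coherence action, then cause the data to be sent
        # from the HN cache, a Slave Node or a further Request Node...

    def on_comp_ack(self, addr: int) -> None:
        """Second time: first beat acknowledged, so free the resources."""
        requester = self.in_flight.pop(addr)
        # Snoops for this address may now be sent to the requester.
        for pending in [s for s in self.deferred_snoops if s[0] == addr]:
            self.deferred_snoops.remove(pending)
            self.send_snoop(addr, requester)

    def request_snoop(self, addr: int, target: str) -> None:
        """Defer snoops to the requester inside the request/ack window."""
        if self.in_flight.get(addr) == target:
            self.deferred_snoops.append((addr, target))
        else:
            self.send_snoop(addr, target)

    def send_snoop(self, addr: int, target: str) -> None:
        print(f"snoop {addr:#x} -> {target}")
```

The early acknowledgement lets the HN free its tracking resources after the first beat; the trade-off, captured in request_snoop, is that the HN must hold back snoops to the requester only during the short request-to-acknowledgement window rather than for the whole transfer.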
- A data processing network comprising:
one or more Request Nodes (102) configured to access a shared data resource;
a Home Node (118) that provides a point of coherency for data of the shared data resource; and
a coherent interconnect (104) configured to couple between the one or more Request Nodes and the Home Node;
where a Request Node of the one or more Request Nodes is configured to perform a method comprising:
sending a request to read data at a first address in the shared data resource to the Home Node;
receiving a plurality of data beats of the requested data, where a first data beat of the plurality of data beats is received at a first time and a last data beat of the plurality of data beats is received at a second time subsequent to the first time;
responsive to receiving the first data beat and prior to receiving the last data beat, sending an acknowledgement message to the Home Node; and
subsequent to sending the acknowledgement message to the Home Node and prior to the second time, accepting snoop requests from the Home Node; and
where the Home Node is configured to perform a method comprising:
receiving at a third time the request to read data at the first address;
performing a coherence action for the data at the first address dependent upon locations of copies of the requested data in the data processing network;
causing the requested data to be transmitted to the Request Node in the plurality of data beats;
receiving at a fourth time the acknowledgement message from the Request Node acknowledging receipt of the first beat of the plurality of data beats, the fourth time prior to receipt of a last data beat of the plurality of data beats by the Request Node;
in a time period between the third time and the fourth time, the Home Node not sending any snoop request to the Request Node for data at the first address; and
subsequent to the fourth time, allowing snoop requests for data at the first address to be sent to the Request Node.
- The data processing network of claim 11, where the Request Node is further configured to buffer a snoop request from the Home Node for the data at the first address when the snoop request is received in the time period between the first time and the second time.
- The data processing network of claim 12, where the Request Node is further configured to send data in response to the snoop request after the last data beat of the plurality of data beats has been received.
- The data processing network of claim 11 or claim 12, where, responsive to a snoop request from the Home Node for the data at the first address received in the time period between the first time and the second time, the Request Node is further configured to forward data beats of the requested data as they are received, and where the Home Node is further configured to allocate, responsive to receiving the request, resources of the Home Node to the read request from the Request Node, and to free the resources responsive to receiving the acknowledgement message from the Request Node acknowledging receipt of a first beat of the plurality of data beats.
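Tying the two sides together, the short driver below walks through one read transaction in the order required by claims 11 to 14. It assumes the RequestNode and HomeNode sketches above are in scope; the address, node names and step ordering are illustrative, with abstract steps standing in for the claimed 'third' and 'fourth' times.

```python
# A toy end-to-end trace of the system of claims 11 to 14, assuming the
# RequestNode and HomeNode sketches above are in the same module. The
# address and node names are illustrative only.

events = []
hn = HomeNode()
rn = RequestNode(
    send_to_home=lambda msg: events.append(("RN0 -> HN", msg)),
    forward_to_target=lambda beat: events.append(("RN0 -> RN1", f"beat {beat}")),
)

rn.start_read()                  # third time: ReadReq reaches the HN
hn.on_read_request(0x80, "RN0")  # HN allocates resources, sources the data
hn.request_snoop(0x80, "RN0")    # inside the window: deferred, not sent
rn.on_data_beat(0)               # first beat triggers the early CompAck
hn.on_comp_ack(0x80)             # fourth time: resources freed and the
                                 # deferred snoop is released to RN0
rn.on_snoop("RN1")               # RN0 forwards the beats it already holds
for beat in (1, 2, 3):
    rn.on_data_beat(beat)        # remaining beats forwarded as they land

for source_dest, payload in events:
    print(source_dest, payload)
```

In a real interconnect the snoop message would carry the forwarding target explicitly; here the argument to on_snoop stands in for that field.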
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201862666256P | 2018-05-03 | 2018-05-03 | |
US16/027,864 US10917198B2 (en) | 2018-05-03 | 2018-07-05 | Transfer protocol in a data processing network |
PCT/GB2019/051215 WO2019211609A1 (en) | 2018-05-03 | 2019-05-02 | Transfer protocol in a data processing network |
Publications (3)
Publication Number | Publication Date |
---|---|
EP3788494A1 EP3788494A1 (en) | 2021-03-10 |
EP3788494C0 EP3788494C0 (en) | 2023-12-27 |
EP3788494B1 true EP3788494B1 (en) | 2023-12-27 |
Family
ID=68383972
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP19723174.9A Active EP3788494B1 (en) | 2018-05-03 | 2019-05-02 | Transfer protocol in a data processing network |
Country Status (6)
Country | Link |
---|---|
US (1) | US10917198B2 (en) |
EP (1) | EP3788494B1 (en) |
JP (1) | JP7284191B2 (en) |
KR (1) | KR20210005194A (en) |
CN (1) | CN112136118B (en) |
WO (1) | WO2019211609A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GR20180100189A (en) | 2018-05-03 | 2020-01-22 | Arm Limited | Data processing system with flow condensation for data transfer via streaming |
US11741028B1 (en) * | 2022-05-20 | 2023-08-29 | Qualcomm Incorporated | Efficiently striping ordered PCIe writes across multiple socket-to-socket links |
US12079132B2 (en) * | 2023-01-26 | 2024-09-03 | Arm Limited | Method and apparatus for efficient chip-to-chip data transfer |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0818732A2 (en) * | 1996-07-01 | 1998-01-14 | Sun Microsystems, Inc. | Hybrid memory access protocol in a distributed shared memory computer system |
US20120079211A1 (en) * | 2010-09-28 | 2012-03-29 | Arm Limited | Coherency control with writeback ordering |
US20130042077A1 (en) * | 2011-08-08 | 2013-02-14 | ARM, Limited | Data hazard handling for copending data access requests |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0718772A1 (en) * | 1994-12-14 | 1996-06-26 | International Business Machines Corporation | Method to improve bus latency and to allow burst transfers of unknown length |
US5893160A (en) * | 1996-04-08 | 1999-04-06 | Sun Microsystems, Inc. | Deterministic distributed multi-cache coherence method and system |
US5887138A (en) * | 1996-07-01 | 1999-03-23 | Sun Microsystems, Inc. | Multiprocessing computer system employing local and global address spaces and COMA and NUMA access modes |
US6631401B1 (en) * | 1998-12-21 | 2003-10-07 | Advanced Micro Devices, Inc. | Flexible probe/probe response routing for maintaining coherency |
US7234029B2 (en) * | 2000-12-28 | 2007-06-19 | Intel Corporation | Method and apparatus for reducing memory latency in a cache coherent multi-node architecture |
US6954829B2 (en) * | 2002-12-19 | 2005-10-11 | Intel Corporation | Non-speculative distributed conflict resolution for a cache coherency protocol |
US7856534B2 (en) * | 2004-01-15 | 2010-12-21 | Hewlett-Packard Development Company, L.P. | Transaction references for requests in a multi-processor network |
US7779210B2 (en) * | 2007-10-31 | 2010-08-17 | Intel Corporation | Avoiding snoop response dependency |
US8250311B2 (en) * | 2008-07-07 | 2012-08-21 | Intel Corporation | Satisfying memory ordering requirements between partial reads and non-snoop accesses |
US8799586B2 (en) * | 2009-09-30 | 2014-08-05 | Intel Corporation | Memory mirroring and migration at home agent |
US8935485B2 (en) * | 2011-08-08 | 2015-01-13 | Arm Limited | Snoop filter and non-inclusive shared cache memory |
US8775904B2 (en) * | 2011-12-07 | 2014-07-08 | International Business Machines Corporation | Efficient storage of meta-bits within a system memory |
CN103036717B (en) * | 2012-12-12 | 2015-11-04 | 北京邮电大学 | System and method for maintaining consistency of distributed data |
US20150261677A1 (en) * | 2014-03-12 | 2015-09-17 | Silicon Graphics International Corp. | Apparatus and Method of Resolving Protocol Conflicts in an Unordered Network |
US10120809B2 (en) * | 2015-09-26 | 2018-11-06 | Intel Corporation | Method, apparatus, and system for allocating cache using traffic class |
CN106713250B (en) * | 2015-11-18 | 2019-08-20 | 杭州华为数字技术有限公司 | Data access method and device based on distributed system |
- 2018
  - 2018-07-05 US US16/027,864 patent/US10917198B2/en active Active
- 2019
  - 2019-05-02 KR KR1020207034228A patent/KR20210005194A/en active Pending
  - 2019-05-02 WO PCT/GB2019/051215 patent/WO2019211609A1/en active Application Filing
  - 2019-05-02 CN CN201980029863.2A patent/CN112136118B/en active Active
  - 2019-05-02 JP JP2020561745A patent/JP7284191B2/en active Active
  - 2019-05-02 EP EP19723174.9A patent/EP3788494B1/en active Active
Also Published As
Publication number | Publication date |
---|---|
JP2021522610A (en) | 2021-08-30 |
CN112136118A (en) | 2020-12-25 |
EP3788494C0 (en) | 2023-12-27 |
JP7284191B2 (en) | 2023-05-30 |
WO2019211609A1 (en) | 2019-11-07 |
CN112136118B (en) | 2024-09-10 |
US20190342034A1 (en) | 2019-11-07 |
KR20210005194A (en) | 2021-01-13 |
US10917198B2 (en) | 2021-02-09 |
EP3788494A1 (en) | 2021-03-10 |
Similar Documents
Publication | Title |
---|---|
US10169080B2 (en) | Method for work scheduling in a multi-chip system |
JP7153441B2 (en) | Data processing |
US9529532B2 (en) | Method and apparatus for memory allocation in a multi-node system |
KR100324975B1 (en) | Non-uniform memory access (NUMA) data processing system that buffers potential third node transactions to decrease communication latency |
US20150254182A1 (en) | Multi-core network processor interconnect with multi-node connection |
US10592459B2 (en) | Method and system for ordering I/O access in a multi-node environment |
US9372800B2 (en) | Inter-chip interconnect protocol for a multi-chip system |
JP7419261B2 (en) | Data processing network using flow compression for streaming data transfer |
EP3788492B1 (en) | Separating completion and data responses for higher read throughput and lower link utilization in a data processing network |
EP3788494B1 (en) | Transfer protocol in a data processing network |
US10437725B2 (en) | Master requesting missing segments of a cache line for which the master has coherence ownership |
US10642760B2 (en) | Techniques for command arbitration in symmetric multiprocessor systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: UNKNOWN |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
|
17P | Request for examination filed |
Effective date: 20201104 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
DAV | Request for validation of the european patent (deleted) | ||
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
17Q | First examination report despatched |
Effective date: 20211117 |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: GRANT OF PATENT IS INTENDED |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: H04L 1/00 20060101ALI20230627BHEP Ipc: G06F 12/0817 20160101ALI20230627BHEP Ipc: G06F 12/0813 20160101ALI20230627BHEP Ipc: G06F 12/0831 20160101AFI20230627BHEP |
|
INTG | Intention to grant announced |
Effective date: 20230717 |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE PATENT HAS BEEN GRANTED |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: EP |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R096 Ref document number: 602019043991 Country of ref document: DE |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D |
|
U01 | Request for unitary effect filed |
Effective date: 20240126 |
|
U07 | Unitary effect registered |
Designated state(s): AT BE BG DE DK EE FI FR IT LT LU LV MT NL PT SE SI Effective date: 20240208 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20240328 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20231227 |
|
U20 | Renewal fee paid [unitary effect] |
Year of fee payment: 6 Effective date: 20240418 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: RS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20231227 Ref country code: NO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20240327 Ref country code: HR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20231227 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20240427 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20240418 Year of fee payment: 6 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20231227 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20231227 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SM Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20231227 Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20231227 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20231227 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R097 Ref document number: 602019043991 Country of ref document: DE |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed |
Effective date: 20240930 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20231227 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20240531 |
Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20231227 Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20240531 |