
CN114298892A - A cache module and system for distributed processing unit - Google Patents


Info

Publication number
CN114298892A
Authority
CN
China
Prior art keywords
module
data
cache
request
modules
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111635416.7A
Other languages
Chinese (zh)
Inventor
胡泰龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha Jingmei Integrated Circuit Design Co ltd
Changsha Jingjia Microelectronics Co ltd
Original Assignee
Changsha Jingmei Integrated Circuit Design Co ltd
Changsha Jingjia Microelectronics Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha Jingmei Integrated Circuit Design Co ltd, Changsha Jingjia Microelectronics Co ltd
Priority to CN202111635416.7A
Publication of CN114298892A
Legal status: Pending

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The application provides a cache module and a system applied to distributed processing units. The cache module comprises a control module, a storage module, and a sniffing module. The control module receives and processes data requests; the storage module caches data; and the sniffing module receives and executes data consistency instructions. Through the control module and the sniffing module, the cache module is compatible with the MESI data consistency protocol and its variants, and can effectively improve data caching efficiency in parallel computing.

Description

A cache module and system for distributed processing units

Technical Field

The present application relates to the technical field of image processing, and in particular to a cache module and system applied to distributed processing units.

Background

When a computer processing unit executes instructions, a cache is used as the location for data reads and writes in order to speed them up, and the cached data is written back to main memory after the computation completes. In multi-core central processing unit (CPU) and graphics processing unit (GPU) scenarios, each processing unit has its own cache. When the units compute in parallel, the same data in main memory may be read and modified by different processing units, so a mechanism is needed to keep the data consistent between main memory and the individual caches during reads and writes.

In graphics processing and computing, operations such as rendering and texture filtering place high demands on computing performance and throughput, so a cache scheme suited to distributed processing units is needed.

Summary of the Invention

To address one of the above technical deficiencies, the present application provides a cache module and system applied to distributed processing units.

A first aspect of the present application provides a cache module applied to a distributed processing unit, the cache module comprising a control module, a storage module, and a sniffing module;

the control module is configured to receive a data request and process the data request;

the storage module is configured to cache data;

the sniffing module is configured to receive and execute data consistency instructions.

Optionally, the control module is configured to receive a data request from the processing unit and determine, based on the data request, whether the requested data is stored in the storage module.

Optionally, the control module is configured to, after determining that the requested data is stored in the storage module, return a hit signal, read the requested data from the storage module, and process the requested data.

Optionally, the control module is configured to modify the status identifier of the requested data.

Optionally, the control module is configured to, after determining that the requested data is not stored in the storage module, obtain the requested data from other cache modules.

Optionally, the control module is configured to, after determining that the requested data is not stored in the storage module and has not been obtained from other cache modules, read the requested data from external main memory.

Optionally, the control module is configured to store the requested data in the storage module.

Optionally, the data consistency instructions include one or more of the following: an instruction to copy the data in the storage module and send it to another cache module, an instruction to mark the copy of the data in the storage module as unavailable, and an instruction to write the data in the storage module back to main memory.

Optionally, a copy of data that has been modified in another cache module is marked as unavailable.

Optionally, the processing unit comprises a plurality of modules;

the control module is configured to receive data requests from the plurality of modules in the processing unit, arbitrate among the modules by priority, and process each data request according to the arbitration result.

Optionally, the sniffing module is configured to, in response to the instruction to copy the data in the storage module and send it to another cache module, copy the requested data in the storage module and write the copied data into the storage module of the other cache module.

A second aspect of the present application provides a cache system applied to distributed processing units, the cache system comprising a master control module and a plurality of cache modules;

the master control module is configured to control the consistency of data transferred between the cache modules;

each cache module is the cache module described in the first aspect above.

Optionally, the master control module is configured to receive a data request from any cache module, generate an instruction to copy the data in a storage module and send it to another cache module, and send that instruction to the other cache modules.

The present application provides a cache module and system applied to distributed processing units. The cache module comprises a control module, a storage module, and a sniffing module. The control module receives and processes data requests; the storage module caches data; the sniffing module receives and executes data consistency instructions. Through the control module and the sniffing module, the cache module provided by the present application is compatible with the MESI data consistency protocol and its variants, and can effectively improve data caching efficiency in parallel computing.

Brief Description of the Drawings

The drawings described here provide a further understanding of the present application and form part of it. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not unduly limit it. In the drawings:

FIG. 1 is a schematic structural diagram of a cache module applied to a distributed processing unit according to an embodiment of the present application;

FIG. 2 is a schematic structural diagram of another cache module applied to a distributed processing unit according to an embodiment of the present application;

FIG. 3 is a schematic structural diagram of a cache system applied to distributed processing units according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of another cache system applied to distributed processing units according to an embodiment of the present application;

FIG. 5 is a schematic structural diagram of another cache system applied to distributed processing units according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a master control module according to an embodiment of the present application.

Detailed Description

To make the technical solutions and advantages of the embodiments of the present application clearer, exemplary embodiments of the present application are described in further detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application, not an exhaustive list. The embodiments in the present application and the features in them may be combined with each other where they do not conflict.

In the course of implementing the present application, the inventor found that in multi-core CPU and GPU scenarios each processing unit has its own cache. When the units compute in parallel, the same data in main memory may be read and modified by different processing units, so a mechanism is needed to keep the data consistent between main memory and the individual caches during reads and writes. In graphics processing and computing, operations such as rendering and texture filtering place high demands on computing performance and throughput, so a cache scheme suited to distributed processing units is needed.

In view of the above problems, embodiments of the present application provide a cache module and system applied to distributed processing units. The cache module comprises a control module, a storage module, and a sniffing module. The control module receives and processes data requests; the storage module caches data; the sniffing module receives and executes data consistency instructions. Through the control module and the sniffing module, the cache module provided by the present application is compatible with the MESI data consistency protocol and its variants, and can effectively improve data caching efficiency in parallel computing.

Referring to FIG. 1, the cache module provided in this embodiment comprises a control module, a storage module, and a sniffing module.

1. Control module

The control module is configured to receive data requests and process them.

The data requests are sent by the processing unit.

If the processing unit comprises multiple modules, each module may send data requests. In that case the control module receives the data requests from the multiple modules in the processing unit, arbitrates among the modules by priority, and processes each data request according to the arbitration result.

For example, the data requests sent by the modules are processed in descending order of the priority determined by arbitration.
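The priority arbitration described above could be modelled roughly as follows. This is only a minimal sketch: the patent does not specify the arbitration policy, so the numeric priorities (lower value wins), the first-come tie-break, and the module names `rasterizer`, `texture_filter`, and `shader` are all illustrative assumptions.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Request:
    priority: int                        # assumption: lower value = higher priority
    seq: int                             # arrival order, used only to break ties
    source: str = field(compare=False)   # hypothetical name of the requesting module
    address: int = field(compare=False)  # address of the requested data


class Arbiter:
    """Grants pending requests from highest to lowest priority."""

    def __init__(self):
        self._heap = []
        self._seq = count()

    def submit(self, source, priority, address):
        heapq.heappush(self._heap, Request(priority, next(self._seq), source, address))

    def grant(self):
        """Return the highest-priority pending request, or None if there is none."""
        return heapq.heappop(self._heap) if self._heap else None


arbiter = Arbiter()
arbiter.submit("texture_filter", priority=1, address=0x40)
arbiter.submit("rasterizer", priority=0, address=0x80)
arbiter.submit("shader", priority=1, address=0xC0)

order = []
while (req := arbiter.grant()) is not None:
    order.append(req.source)
print(order)  # ['rasterizer', 'texture_filter', 'shader']
```

A hardware arbiter would typically use fixed-priority or round-robin logic rather than a heap; the heap is used here only to keep the software model short.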

Referring to FIG. 2, the control module is specifically configured to receive a data request from the processing unit and determine, based on the data request, whether the requested data is stored in the storage module.

If the control module determines that the requested data is stored in the storage module, it returns a hit signal, reads the requested data from the storage module, and processes it. If it determines that the requested data is not stored in the storage module, it obtains the requested data from other cache modules (for example, by sending the other cache modules an instruction to copy the data in their storage module and send it to another cache module, in response to which the other cache modules send the requested data). If the requested data is neither stored in the storage module nor obtained from other cache modules, the control module reads the requested data from external main memory.

When the requested data is not stored in the storage module and the control module obtains it (either from another cache module or from external main memory), the control module stores the requested data in the storage module.

In addition, the control module is configured to modify the status identifier of the requested data.
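The lookup order described above (local storage module, then other cache modules, then external main memory) can be illustrated with a small software model. This is a sketch under simplifying assumptions: the class `CacheModule`, its `peers` list, and the dictionary standing in for main memory are hypothetical, and the peers are queried directly rather than through a master control module.

```python
class CacheModule:
    """Toy model of the control module's hit/miss handling."""

    def __init__(self, name, main_memory):
        self.name = name
        self.storage = {}               # storage module: address -> data
        self.peers = []                 # other cache modules (hypothetical direct links)
        self.main_memory = main_memory  # external main memory: address -> data

    def read(self, address):
        """Return (data, source), where source records where the data was found."""
        if address in self.storage:              # hit in the local storage module
            return self.storage[address], "hit"
        for peer in self.peers:                  # miss: try the other cache modules
            if address in peer.storage:
                data = peer.storage[address]
                self.storage[address] = data     # keep a local copy
                return data, f"peer:{peer.name}"
        data = self.main_memory[address]         # last resort: external main memory
        self.storage[address] = data
        return data, "main_memory"


memory = {0x10: "a", 0x20: "b"}
a = CacheModule("A", memory)
b = CacheModule("B", memory)
a.peers, b.peers = [b], [a]
b.storage[0x10] = "a"                            # B already caches address 0x10

results = [a.read(0x10), a.read(0x10), a.read(0x20)]
print(results)  # [('a', 'peer:B'), ('a', 'hit'), ('b', 'main_memory')]
```

The status identifier update mentioned in the text is left out here to keep the lookup path readable.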

2. Storage module

The storage module is configured to cache data.

3. Sniffing module

The sniffing module is configured to receive and execute data consistency instructions.

The data consistency instructions include, but are not limited to, one or more of the following: an instruction to copy the data in the storage module and send it to another cache module, an instruction to mark the copy of the data in the storage module as unavailable, and an instruction to write the data in the storage module back to main memory.

A copy of data that has been modified in another cache module is marked as unavailable. That is, when a piece of data is modified in another cache module, the sniffing module of this cache module receives an instruction to mark the copy of that data in its storage module as unavailable.

When the sniffing module receives an instruction to copy the data in the storage module and send it to another cache module, it copies the requested data in the storage module accordingly and writes the copy into the storage module of the other cache module.
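The three data consistency instructions listed above can be sketched as a small dispatcher. The names `Instruction` and `SnoopModule`, and the layout of a cache line as a dictionary with `data` and `valid` fields, are assumptions made for illustration; the patent does not define an instruction encoding.

```python
from enum import Enum, auto


class Instruction(Enum):
    COPY_AND_SEND = auto()  # copy local data and send it to another cache module
    INVALIDATE = auto()     # mark the local copy as unavailable
    WRITE_BACK = auto()     # write the local data back to main memory


class SnoopModule:
    """Toy model of the sniffing module acting on one cache's storage."""

    def __init__(self, storage, main_memory):
        self.storage = storage          # address -> {"data": ..., "valid": bool}
        self.main_memory = main_memory  # address -> data

    def execute(self, instruction, address, target_storage=None):
        """Execute one instruction; return False if no valid local copy exists."""
        line = self.storage.get(address)
        if line is None or not line["valid"]:
            return False
        if instruction is Instruction.COPY_AND_SEND:
            if target_storage is None:
                raise ValueError("COPY_AND_SEND needs a target storage module")
            target_storage[address] = {"data": line["data"], "valid": True}
        elif instruction is Instruction.INVALIDATE:
            line["valid"] = False
        elif instruction is Instruction.WRITE_BACK:
            self.main_memory[address] = line["data"]
        return True


memory = {}
storage_a = {}
storage_b = {0x10: {"data": 7, "valid": True}}
snoop_b = SnoopModule(storage_b, memory)

sent = snoop_b.execute(Instruction.COPY_AND_SEND, 0x10, target_storage=storage_a)
written = snoop_b.execute(Instruction.WRITE_BACK, 0x10)
invalidated = snoop_b.execute(Instruction.INVALIDATE, 0x10)
after_invalidate = snoop_b.execute(Instruction.COPY_AND_SEND, 0x10, target_storage=storage_a)
```

After the invalidate step, the final copy request is refused because cache B no longer holds a usable copy, which is the behaviour the text describes for data modified elsewhere.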

The cache module applied to a distributed processing unit provided in this embodiment comprises a control module, a storage module, and a sniffing module. The control module receives and processes data requests; the storage module caches data; the sniffing module receives and executes data consistency instructions. Through the control module and the sniffing module, the cache module provided in this embodiment is compatible with the MESI data consistency protocol and its variants, and can effectively improve data caching efficiency in parallel computing.

Referring to FIG. 3, this embodiment provides a cache system applied to distributed processing units. The cache system comprises a master control module and a plurality of cache modules.

1. Master control module

The master control module is configured to control the consistency of data transferred between the cache modules.

Specifically, the master control module is configured to receive a data request from any cache module, generate an instruction to copy the data in a storage module and send it to another cache module, and send that instruction to the other cache modules.

2. Cache module

Each cache module is the cache module shown in FIG. 1.

For example, each cache module comprises a control module, a storage module, and a sniffing module.

1) Control module

The control module is configured to receive data requests and process them.

The data requests are sent by the processing unit.

If the processing unit comprises multiple modules, each module may send data requests. In that case the control module receives the data requests from the multiple modules in the processing unit, arbitrates among the modules by priority, and processes each data request according to the arbitration result.

For example, the data requests sent by the modules are processed in descending order of the priority determined by arbitration.

The control module is specifically configured to receive a data request from the processing unit and determine, based on the data request, whether the requested data is stored in the storage module.

If the control module determines that the requested data is stored in the storage module, it returns a hit signal, reads the requested data from the storage module, and processes it. If it determines that the requested data is not stored in the storage module, it obtains the requested data from other cache modules (for example, by sending the other cache modules an instruction to copy the data in their storage module and send it to another cache module, in response to which the other cache modules send the requested data). If the requested data is neither stored in the storage module nor obtained from other cache modules, the control module reads the requested data from external main memory.

When the requested data is not stored in the storage module and the control module obtains it (either from another cache module or from external main memory), the control module stores the requested data in the storage module.

In addition, the control module is configured to modify the status identifier of the requested data.

2) Storage module

The storage module is configured to cache data.

3) Sniffing module

The sniffing module is configured to receive and execute data consistency instructions.

The data consistency instructions include, but are not limited to, one or more of the following: an instruction to copy the data in the storage module and send it to another cache module, an instruction to mark the copy of the data in the storage module as unavailable, and an instruction to write the data in the storage module back to main memory.

A copy of data that has been modified in another cache module is marked as unavailable. That is, when a piece of data is modified in another cache module, the sniffing module of this cache module receives an instruction to mark the copy of that data in its storage module as unavailable.

When the sniffing module receives an instruction to copy the data in the storage module and send it to another cache module, it copies the requested data in the storage module accordingly and writes the copy into the storage module of the other cache module.

The cache module consists of a control module, a storage module, and a sniffing module, and each distributed processing unit contains one such cache module. Multiple distributed processing units connect to the same master control module so that data can be shared among the cache modules.

A data request from the processing unit is first sent to the control module, which determines whether the requested data is available in the cache and acts accordingly: if the data is available, it returns a hit signal and instructs the storage module to read out the data; if the data is not available, it requests the data from main memory or from the cache of another unit, and instructs the read-out once the data has been obtained. The sniffing module receives instructions from the master control module in order to share data among the cache modules and keep the data consistent. Specifically, the instructions received by the sniffing module mainly include: copying data in this cache module that another cache module has requested and sending it to the requester; marking the copy in this module as unavailable when another cache module has modified the data; and writing data back to main memory. The storage module serves as the storage unit for cached data.

The cache module serves two kinds of clients: other modules inside the processing unit that need to read and write data, and the cache modules of other processing units. Data transfers between cache modules are carried out by the sniffing modules under the control of the master control module.

The other modules inside the processing unit have different priorities for reading and writing cached data, depending on their functions, so the control module must arbitrate among their requests. Once the highest-priority request wins arbitration, the control module modifies the status identifier of the data according to the type of the request and whether the requested data is available in the storage module. At the same time, the storage module reads out the corresponding cached data according to the data address contained in the request and returns a data-available signal. If the requested data is not present in the storage module, the control module returns a pending signal and sends a data request through the master control module to the other connected cache modules.

For example, when data that cache module A needs to read is already cached in cache module B, the data request sent by cache module A is forwarded by the master control module to all cache modules connected to cache module A. On receiving the request, the control module in cache module B checks whether the data is absent or occupied. If the data is available, cache module B returns a signal to the master control module, and the sniffing module in cache module B writes the data requested by cache module A into the storage module of cache module A. If none of the cache modules connected to cache module A holds the requested data in an available state, cache module A reads the requested data from external main memory and caches it in its own storage module.

Taking four connected cache modules as a further example, FIG. 4 shows their structural relationship to the master control module and main memory, together with the signals involved. Each cache module sends the master control module the state of each piece of cached data (including available, shared, modified, and unavailable, among others) and its data requests, and receives return signals and data request signals from the master control module. Each cache module is also connected to main memory through a direct memory access (DMA) module, which lets it read data from main memory and write cached data to main memory. In addition, under the control of the master control module, each cache module can send cached data onto the data sharing link through its own sniffing module for use by other cache modules that need it.
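The per-line states named above (available, shared, modified, unavailable) can be pictured as a small state machine. The sketch below is an assumption-laden illustration, not the patent's state table: it treats "available" as roughly analogous to MESI Exclusive and "unavailable" as MESI Invalid, and the event names `local_write`, `remote_read`, and `remote_write` are hypothetical.

```python
from enum import Enum


class LineState(Enum):
    AVAILABLE = "available"      # assumed analogous to MESI Exclusive
    SHARED = "shared"
    MODIFIED = "modified"
    UNAVAILABLE = "unavailable"  # assumed analogous to MESI Invalid


# (current state, event) -> next state, for lines that currently hold valid data.
TRANSITIONS = {}
for state in (LineState.AVAILABLE, LineState.SHARED, LineState.MODIFIED):
    TRANSITIONS[(state, "local_write")] = LineState.MODIFIED      # this cache modifies the data
    TRANSITIONS[(state, "remote_read")] = LineState.SHARED        # another cache receives a copy
    TRANSITIONS[(state, "remote_write")] = LineState.UNAVAILABLE  # another cache modifies the data


def next_state(state, event):
    """Return the next state; unlisted pairs leave the state unchanged.

    Misses on UNAVAILABLE lines go through the lookup path and are not modelled here.
    """
    return TRANSITIONS.get((state, event), state)


print(next_state(LineState.AVAILABLE, "remote_read").value)  # shared
print(next_state(LineState.SHARED, "remote_write").value)    # unavailable
```

In a full MESI implementation a modified line would also be written back or forwarded before it becomes shared; that step corresponds to the write-back and copy-and-send instructions and is omitted from this table.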

Inside the cache module, the structural relationship and signals of its main modules are shown in FIG. 5. Cache data requests from the processing unit or the master control module enter the control module in turn after arbitration. The control module determines whether the data is available and sends a feedback signal to the requester. If the data is available, the control module also sends the address of the data in the storage module, and the requester can read and write the storage module through that address. If the data is not available, the control module sends a request to the master control module, and the data is eventually copied into the storage module by the direct memory access (DMA) module, which receives it either from main memory or from the sniffing module of another cache module. The sniffing module receives data sniffing instructions from the master control module, reads the corresponding data from the storage module, and outputs it to the other cache modules that requested it; the data can also be written to main memory at the same time.

The structure of the master control module is shown in FIG. 6. Data requests from multiple cache modules are arbitrated, and the highest-priority data lookup request is then fed to the sniffing module of each cache module. At the same time, the identifier of the data corresponding to that request is marked as pending until at least one cache module has completed the transfer of the data. Until then, further requests for the same data are not prioritized by the arbiter, because the identifier is pending.
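The pending-identifier behaviour of the master control module can be sketched as follows. This is a simplified model under stated assumptions: arbitration is reduced to arrival order, and the class `MasterControl` with its `submit`, `dispatch`, and `complete` methods is hypothetical rather than taken from the patent.

```python
class MasterControl:
    """Toy model: dispatch requests, skipping data whose transfer is still pending."""

    def __init__(self):
        self.queue = []       # (requester, address) in arrival order (assumed policy)
        self.pending = set()  # addresses whose transfer has not completed yet

    def submit(self, requester, address):
        self.queue.append((requester, address))

    def dispatch(self):
        """Return the first request whose address is not pending and mark it pending."""
        for i, (requester, address) in enumerate(self.queue):
            if address not in self.pending:
                self.queue.pop(i)
                self.pending.add(address)
                return requester, address
        return None  # every queued request targets pending data

    def complete(self, address):
        """Called once at least one cache module has finished transferring the data."""
        self.pending.discard(address)


mc = MasterControl()
mc.submit("A", 0x10)
mc.submit("B", 0x10)
mc.submit("C", 0x20)

first = mc.dispatch()    # A's request for 0x10 is dispatched and 0x10 becomes pending
second = mc.dispatch()   # B is skipped because 0x10 is pending, so C's request goes next
third = mc.dispatch()    # only B's request remains and it is still blocked
mc.complete(0x10)
fourth = mc.dispatch()   # B's request can now proceed
```

Skipping pending addresses keeps duplicate lookups for the same data from occupying the sniffing modules while a transfer is already in flight.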

本实施提供的应用于分布式处理单元的高速缓存系统实现了多个高速缓存模块之间的数据共享。其总控模块、缓存控制模块与嗅探模块考虑了与常用的缓存一致性协议及其若干变种的适配性,能够在多个处理单元并行计算时有效地提高数据缓存的效率。The cache system applied to the distributed processing unit provided by this implementation realizes data sharing among multiple cache modules. Its general control module, cache control module and sniffing module take into account the adaptability with commonly used cache coherence protocols and several variants thereof, which can effectively improve the efficiency of data cache when multiple processing units perform parallel computing.

本实施例提供的应用于分布式处理单元的高速缓存系统,由总控模块和多个高速缓存模块构成,其中每个高速缓存模块均包括:控制模块、存储模块和嗅探模块;控制模块,用于接收数据请求并对数据请求进行处理;存储模块,用于缓存数据;嗅探模块,用于接收并执行数据一致性指令。通过总控模块、控制模块和嗅探模块,保证了和MESI数据一致性协议及其变种的适配性,能够在并行计算时有效地提高数据缓存的效率。The cache system applied to the distributed processing unit provided by this embodiment is composed of a general control module and a plurality of cache modules, wherein each cache module includes: a control module, a storage module and a sniffing module; a control module, It is used to receive and process data requests; the storage module is used to cache data; the sniffing module is used to receive and execute data consistency instructions. Through the master control module, the control module and the sniffing module, the adaptability with the MESI data consistency protocol and its variants is ensured, and the efficiency of data cache can be effectively improved in parallel computing.

本领域内的技术人员应明白,本申请的实施例可提供为方法、系统、或计算机程序产品。因此,本申请可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本申请可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。本申请实施例中的方案可以采用各种计算机语言实现,例如,面向对象的程序设计语言Java和直译式脚本语言JavaScript等。As will be appreciated by those skilled in the art, the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein. The solutions in the embodiments of the present application may be implemented in various computer languages, for example, the object-oriented programming language Java and the literal translation scripting language JavaScript, and the like.

The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Although preferred embodiments of the present application have been described, those skilled in the art may make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications that fall within the scope of the present application.

Obviously, those skilled in the art can make various changes and variations to the present application without departing from its spirit and scope. Thus, if these changes and variations fall within the scope of the claims of the present application and their technical equivalents, the present application is also intended to include them.

Claims (13)

1. A cache module applied to a distributed processing unit, characterized in that the cache module comprises: a control module, a storage module, and a sniffing module; the control module is configured to receive data requests and process the data requests; the storage module is configured to cache data; the sniffing module is configured to receive and execute data consistency instructions.

2. The cache module according to claim 1, characterized in that the control module is configured to receive a data request from a processing unit and to determine, according to the data request, whether the requested data is stored in the storage module.

3. The cache module according to claim 2, characterized in that the control module is configured to, after determining that the requested data is stored in the storage module, return a hit signal, read the requested data from the storage module, and process the requested data.

4. The cache module according to claim 3, characterized in that the control module is configured to modify the status identifier of the requested data.

5. The cache module according to claim 2, characterized in that the control module is configured to, after determining that the requested data is not stored in the storage module, acquire the requested data from other cache modules.

6. The cache module according to claim 5, characterized in that the control module is configured to, after determining that the requested data is not stored in the storage module and has not been acquired from other cache modules, read the requested data from external main memory.

7. The cache module according to claim 5 or 6, characterized in that the control module is configured to store the requested data in the storage module.

8. The cache module according to claim 1, characterized in that the data consistency instructions include one or more of the following: an instruction to copy data in the storage module and send it to other cache modules; an instruction to mark a copy of data in the storage module as unavailable; and an instruction to write data in the storage module back to main memory.

9. The cache module according to claim 8, characterized in that copies of data that have already been modified in other cache modules are marked as unavailable.

10. The cache module according to claim 1, characterized in that the processing unit comprises a plurality of modules; the control module is configured to receive data requests from the plurality of modules in the processing unit, arbitrate the priorities of the modules, and process each data request according to the arbitration result.

11. The cache module according to claim 3, characterized in that the sniffing module is configured to, according to the instruction to copy data in the storage module and send it to other cache modules, copy the requested data in the storage module and write the copied data into the storage modules of the other cache modules.

12. A cache system applied to a distributed processing unit, characterized in that the cache system comprises: a master control module and a plurality of cache modules; the master control module is configured to control the consistency of data transmitted between the cache modules; each cache module is the cache module according to any one of claims 1 to 11.

13. The cache system according to claim 12, characterized in that the master control module is configured to receive a data request from any cache module, generate an instruction to copy data in the storage module and send it to other cache modules, and send that instruction to the other cache modules.
CN202111635416.7A 2021-12-29 2021-12-29 A cache module and system for distributed processing unit Pending CN114298892A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111635416.7A CN114298892A (en) 2021-12-29 2021-12-29 A cache module and system for distributed processing unit

Publications (1)

Publication Number Publication Date
CN114298892A true CN114298892A (en) 2022-04-08

Family

ID=80972496

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111635416.7A Pending CN114298892A (en) 2021-12-29 2021-12-29 A cache module and system for distributed processing unit

Country Status (1)

Country Link
CN (1) CN114298892A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5987571A (en) * 1996-04-24 1999-11-16 Hitachi, Ltd. Cache coherency control method and multi-processor system using the same
US20070180198A1 (en) * 2006-02-02 2007-08-02 Hitachi, Ltd. Processor for multiprocessing computer systems and a computer system
US20080215823A1 (en) * 2007-02-08 2008-09-04 Takeo Hosomi Data consistency control system and data consistency control method
CN103246614A (en) * 2012-02-08 2013-08-14 国际商业机器公司 Multiprocessor data processing system, high-speed cache memory and method thereof
CN103593306A (en) * 2013-11-15 2014-02-19 浪潮电子信息产业股份有限公司 Design method for Cache control unit of protocol processor

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118051685A (en) * 2024-03-15 2024-05-17 海光信息技术股份有限公司 Data processing method and device, electronic device and storage medium
CN118051685B (en) * 2024-03-15 2025-02-18 海光信息技术股份有限公司 Data processing method and device, electronic equipment and storage medium
CN118820134A (en) * 2024-09-20 2024-10-22 北京卡普拉科技有限公司 Cache consistency optimization method in automatic thread-level parallelization
CN118820134B (en) * 2024-09-20 2025-02-07 北京卡普拉科技有限公司 Cache consistency optimization method in automatic thread-level parallelization

Similar Documents

Publication Publication Date Title
EP3796179B1 (en) System, apparatus and method for processing remote direct memory access operations with a device-attached memory
US11269774B2 (en) Delayed snoop for improved multi-process false sharing parallel thread performance
US9760386B2 (en) Accelerator functionality management in a coherent computing system
US12229051B2 (en) Memory management device for performing DMA operations between a main memory and a cache memory
EP4124963B1 (en) System, apparatus and methods for handling consistent memory transactions according to a cxl protocol
CN101751370B (en) System and method for maintaining cache coherency across a serial interface bus
CN114298892A (en) A cache module and system for distributed processing unit
JP2024544809A (en) SYSTEM, APPARATUS AND METHOD FOR DIRECT DATA READ FROM MEMORY - Patent application
EP3343380A1 (en) Data read method and apparatus
CN110083548B (en) Data processing method and related network element, equipment and system
CN113792006B (en) Inter-device processing system with cache coherency
TWI759397B (en) Apparatus, master device, processing unit, and method for compare-and-swap transaction
CN114356839B (en) Method, device, processor and device readable storage medium for processing write operation
JP2003345648A5 (en)
JP7606517B2 (en) System Direct Memory Access Engine Offload
CN115543201B (en) A method for accelerating core request completion in a shared memory system
JP6565729B2 (en) Arithmetic processing device, control device, information processing device, and control method for information processing device
JP7668347B2 (en) A multilevel cache coherency protocol for cache line eviction.
CN119690692A (en) Data processing method and device
CN119003410A (en) Method, device, equipment and storage medium for optimizing communication between storage controllers
CN120104549A (en) A shared memory data access method, device, equipment and storage medium
JP2008112324A (en) Data transfer method and data transfer device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination