CN103902470B - Method, device and system for handling read misses
- Publication number
- CN103902470B (application CN201210571969.5A)
- Authority
- CN
- China
- Prior art keywords
- processor
- cache line
- read transaction
- cache
- bus
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The invention provides a method, device and system for handling a read miss. In the method, a first processor generates address information that contains a Cache Tag. When the first processor determines that a first Cache Line exists, where the Tag of the first Cache Line has the same value as the Cache Tag and its status bit indicates the invalid state, it obtains the information of a second processor recorded in the first Cache Line. According to that information, the first processor unicasts a bus read transaction to the second processor, so that the second processor provides the data of the first Cache Line if it stores a valid copy of that data. Embodiments of the invention can reduce the power consumption overhead of read misses.
Description
Technical field
The present invention relates to storage technology, and in particular to a method, device and system for handling read misses.
Background
A cache memory (Cache) is a small, high-speed memory that sits between the processor and main memory in the hierarchy of a computer storage system. A Cache generally consists of many cache lines (Cache Line), each of which is an independent entry in the Cache.
During a read operation, the Cache receives address information and compares the cache tag (Cache Tag) in the address information with the tags (Tag) in its Cache Lines. A read miss occurs when no Cache Line corresponds to the Cache Tag, or when such a Cache Line exists but is invalid.
After a read miss, the Cache broadcasts a bus read transaction over the bus to all the Caches on the bus. The bus read transaction contains the address information that missed. Each Cache that receives the bus read transaction also performs a Tag comparison and, if it holds valid data for that address information, responds to the bus read transaction and provides the valid data to the Cache that missed.
As the above description shows, the prior art handles such a read miss by broadcasting, so every Cache on the bus that receives the bus read transaction has to perform a Tag comparison, which incurs a large power consumption overhead.
Summary of the invention
In view of this, embodiments of the present invention provide a method, device and system for handling read misses, to solve the prior-art problem that read misses incur a large power consumption overhead.
In a first aspect, a method for handling a read miss is provided, including:
a first processor generates address information, where the address information contains a Cache Tag;
when the first processor determines that a first Cache Line exists, it obtains the information of a second processor recorded in the first Cache Line, where the Tag of the first Cache Line has the same value as the Cache Tag and its status bit indicates the invalid state;
according to the information of the second processor, the first processor unicasts a bus read transaction to the second processor, so that the second processor provides the data of the first Cache Line when it stores a valid copy of the data of the first Cache Line.
With reference to the first aspect, in a first possible implementation of the first aspect, the method further includes:
after the status bit of the first Cache Line becomes invalid, the first processor records the information of the second processor in the first Cache Line, where the second processor is the processor that most recently stored a copy of the data of the first Cache Line.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the information of the second processor is recorded in the data area of the first Cache Line.
With reference to the first aspect or the first possible implementation of the first aspect, in a third possible implementation of the first aspect, the method further includes:
the first processor receives a bus read transaction response sent by the second processor, where the response contains the data of the first Cache Line when the second processor stores a valid copy of that data, or, when the second processor does not store a valid copy of that data, the response instructs the first processor to broadcast a bus read transaction on the bus; or,
if the first processor does not receive a bus read transaction response from the second processor within a set time, it broadcasts a bus read transaction on the bus, where the second processor sends no response when it does not store a valid copy of the data of the first Cache Line.
In a second aspect, a method for handling a read miss is provided, including:
a second processor receives a bus read transaction unicast by a first processor, where the first processor sends the bus read transaction according to the information of the second processor recorded in a first Cache Line, the Tag of the first Cache Line has the same value as the Cache Tag in the address information generated by the first processor, and its status bit indicates the invalid state;
when the second processor stores a valid copy of the data of the first Cache Line, it provides the data of the first Cache Line to the first processor.
With reference to the second aspect, in a first possible implementation of the second aspect, the method further includes:
when the second processor does not store a valid copy of the data of the first Cache Line, it sends the first processor a bus read transaction response that instructs the first processor to broadcast a bus read transaction on the bus; or,
when the second processor does not store a valid copy of the data of the first Cache Line, it sends no bus read transaction response, so that the first processor, having received no response from the second processor within a set time, broadcasts a bus read transaction on the bus.
In a third aspect, a device for handling a read miss is provided, including:
a generating module, configured to generate address information, where the address information contains a Cache Tag;
an obtaining module, configured to, when it is determined that a first Cache Line exists, obtain the information of a second processor recorded in the first Cache Line, where the Tag of the first Cache Line has the same value as the Cache Tag and its status bit indicates the invalid state;
a sending module, configured to unicast a bus read transaction to the second processor according to the information of the second processor, so that the second processor provides the data of the first Cache Line when it stores a valid copy of the data of the first Cache Line.
With reference to the third aspect, in a first possible implementation of the third aspect, the device further includes:
a recording module, configured to record the information of the second processor in the first Cache Line after the status bit of the first Cache Line becomes invalid, where the second processor is the processor that most recently stored a copy of the data of the first Cache Line.
With reference to the third aspect or the first possible implementation of the third aspect, in a second possible implementation of the third aspect, the information of the second processor is recorded in the data area of the first Cache Line.
With reference to the third aspect or the first possible implementation of the third aspect, in a third possible implementation of the third aspect, the device further includes:
a receiving module, configured to receive a bus read transaction response sent by the second processor, where the response contains the data of the first Cache Line when the second processor stores a valid copy of that data, or, when the second processor does not store a valid copy of that data, the response instructs the first processor to broadcast a bus read transaction on the bus; or,
a broadcasting module, configured to broadcast a bus read transaction on the bus if no bus read transaction response from the second processor is received within a set time, where the second processor sends no response when it does not store a valid copy of the data of the first Cache Line.
In a fourth aspect, a device for handling a read miss is provided, including:
a receiving module, configured to receive a bus read transaction unicast by a first processor, where the first processor sends the bus read transaction according to the information of the second processor recorded in a first Cache Line, the Tag of the first Cache Line has the same value as the Cache Tag in the address information generated by the first processor, and its status bit indicates the invalid state;
a sending module, configured to provide the data of the first Cache Line to the first processor when a valid copy of the data of the first Cache Line is stored.
With reference to the fourth aspect, in a first possible implementation of the fourth aspect, the sending module is further configured to:
when no valid copy of the data of the first Cache Line is stored, send the first processor a bus read transaction response that instructs the first processor to broadcast a bus read transaction on the bus; or,
when no valid copy of the data of the first Cache Line is stored, send no bus read transaction response, so that the first processor, having received no response from the second processor within a set time, broadcasts a bus read transaction on the bus.
In a fifth aspect, a system for handling a read miss is provided, including:
any device of the third aspect; and,
any device of the fourth aspect.
With the above technical solutions, a Cache Line records the information of the processor that most recently stored a copy of it, and on a read miss a bus read transaction is unicast to that processor according to the recorded information. This avoids the problems caused by broadcasting and reduces the power consumption overhead.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the basic structure of a Cache in an embodiment of the present invention;
FIG. 2 is a schematic flowchart of an embodiment of the read-miss handling method of the present invention;
FIG. 3 is a schematic flowchart of another embodiment of the read-miss handling method of the present invention;
FIG. 4 is a schematic flowchart of another embodiment of the read-miss handling method of the present invention;
FIG. 5 is a schematic flowchart of another embodiment of the read-miss handling method of the present invention;
FIG. 6 is a diagram of the MSI state changes of a Cache Line caused by memory read and write operations of the local processor in an embodiment of the present invention;
FIG. 7 is a diagram of the MSI state changes of a Cache Line caused by memory read and write operations of processors on the bus in an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of an embodiment of the read-miss handling device of the present invention;
FIG. 9 is a schematic structural diagram of another embodiment of the read-miss handling device of the present invention;
FIG. 10 is a schematic structural diagram of another embodiment of the read-miss handling device of the present invention;
FIG. 11 is a schematic structural diagram of another embodiment of the read-miss handling device of the present invention;
FIG. 12 is a schematic structural diagram of an embodiment of the read-miss handling system of the present invention.
Detailed description
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments that a person of ordinary skill in the art obtains from these embodiments without creative effort fall within the protection scope of the present invention.
FIG. 1 is a schematic diagram of the basic structure of a Cache. The address information of the Cache has three parts: the cache tag (Cache Tag), the cache index (Cache index) and the offset. The Cache index locates a Cache Line within the Cache. A Cache Line has three parts: the status bit (Valid bit), the tag (Tag) and the data (Data). If the Cache has w ways, one Cache index selects w Cache Lines at once, and these w Cache Lines form a set (Cache Set). The Cache Tag is then compared with the Tag in each of these Cache Lines. If the values are equal, the Cache hits, and a multiplexer (MUX) outputs the data of the Cache Line that hit. Because the Cache Tag and the Tag in a Cache Line may differ in bit width, the Cache Tag may first be converted by a Translation Lookaside Buffer (TLB) to the same width as the Tag in the Cache Line before the two values are compared.
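The address split and tag comparison of FIG. 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the bit widths `OFFSET_BITS` and `INDEX_BITS`, the `CacheLine` class and the helper names are all assumptions chosen for the example, and the TLB conversion step is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

OFFSET_BITS = 6  # assumed: 64-byte Cache Lines
INDEX_BITS = 7   # assumed: 128 Cache Sets


def split_address(addr: int) -> Tuple[int, int, int]:
    """Split an address into (Cache Tag, Cache index, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


@dataclass
class CacheLine:
    valid: bool = False  # status bit (Valid bit)
    tag: int = 0         # Tag
    data: bytes = b""    # Data


def lookup(cache_set: List[CacheLine], tag: int) -> Optional[CacheLine]:
    """Compare the Cache Tag with the Tag of each way; return the line that hits."""
    for line in cache_set:
        if line.valid and line.tag == tag:
            return line
    return None
```

In hardware the w comparisons happen in parallel and a MUX selects the hit; the loop above only models the result of that selection.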
A Cache greatly reduces the memory bandwidth that a single processor requires, and once each processor requires less memory bandwidth, multiple processors can share the same memory. A shared-memory system supports caching of both shared and private data: private data is used by a single processor, whereas shared data is used by multiple processors. Because two different processors see memory through their own Caches, without further precautions one processor may change the value of shared data in its Cache while the other processor still holds the old value, so that the two processors obtain two different values. This is the Cache coherence problem.
The key to implementing a write-invalidate protocol in a multiprocessor system is to use the bus or another broadcast medium to carry out the invalidation. To invalidate, a processor only needs to acquire the bus and then broadcast the address of the data to be invalidated on the bus (a bus write-invalidate transaction). All processors continuously snoop the bus to monitor addresses. Each processor checks whether its own Cache contains the address broadcast on the bus and, if so, sets the corresponding data in its Cache to invalid.
When a processor has a read miss, it broadcasts the address of the missing data on the bus (a bus read transaction). All processors continuously snoop the bus to monitor addresses. Each processor checks whether its own Cache contains the address broadcast on the bus and, if so, responds to the bus read transaction and provides the Cache Line.
When a processor has a write miss, it broadcasts the address of the missing data on the bus (a bus exclusive-read transaction). All processors continuously snoop the bus to monitor addresses. Each processor checks whether its own Cache contains the address broadcast on the bus and, if so, responds to the bus exclusive-read transaction, provides the Cache Line, and invalidates its local copy.
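The cost of the broadcast read described above can be made concrete with a small model. This is a hedged sketch: `SnoopingCache`, `broadcast_read` and the `tag_compares` counter are hypothetical names introduced here, and a per-address dictionary stands in for the real tag array.

```python
class SnoopingCache:
    """Toy model of one processor's Cache on a shared bus."""

    def __init__(self, name):
        self.name = name
        self.lines = {}        # address -> data, valid lines only
        self.tag_compares = 0  # how many times this Cache had to check its tags

    def snoop_read(self, addr):
        # Every snooped bus read transaction costs one tag comparison.
        self.tag_compares += 1
        return self.lines.get(addr)


def broadcast_read(requester, caches, addr):
    """Broadcast a bus read transaction; every other Cache snoops it."""
    data = None
    for cache in caches:
        if cache is requester:
            continue
        reply = cache.snoop_read(addr)
        if reply is not None and data is None:
            data = reply
    return data
```

With four Caches on the bus, one read miss triggers a tag comparison in each of the other three, even though at most one of them needs to supply the data.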
This applies not only to the write-invalidate protocol described above: existing coherence protocols in general also broadcast a bus read transaction when a Cache read miss occurs. To solve the problem of the large power consumption overhead caused by broadcasting, the present invention provides the following embodiments.
FIG. 2 is a schematic flowchart of an embodiment of the read-miss handling method of the present invention, including:
21: A first processor generates address information, where the address information contains a cache tag (Cache Tag);
22: When the first processor determines that a first cache line (Cache Line) exists, it obtains the information of a second processor recorded in the first Cache Line, where the tag (Tag) of the first Cache Line has the same value as the Cache Tag and its status bit indicates the invalid state;
23: According to the information of the second processor, the first processor unicasts a bus read transaction to the second processor, so that the second processor provides the data of the first Cache Line to the first processor when it stores a valid copy of the data of the first Cache Line.
The first processor may be any processor that is about to initiate a bus read transaction.
Optionally, as shown in FIG. 3, before the information of the processor recorded in the first Cache Line is read, for example before step 21, the method may further include:
31: After the status bit of the first Cache Line becomes invalid, the first processor records the information of the second processor in the first Cache Line, where the second processor is the processor that most recently stored a copy of the data of the first Cache Line.
Optionally, the information of the second processor may be recorded in the data area (Data) of the first Cache Line.
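One way to picture step 31 is a Cache Line whose data area is reused once the line is invalid. The sketch below is illustrative only: the patent does not specify an encoding, so the 2-byte little-endian processor number, the 64-byte `LINE_SIZE` and the function names are assumptions.

```python
from dataclasses import dataclass

LINE_SIZE = 64  # assumed size of the data area in bytes


@dataclass
class CacheLine:
    tag: int
    valid: bool
    data: bytes


def invalidate_and_record(line: CacheLine, owner_id: int) -> None:
    """Mark the line invalid and store the second processor's number in its data area."""
    line.valid = False
    # Assumed encoding: processor number in the first 2 bytes, rest zero-filled.
    line.data = owner_id.to_bytes(2, "little") + bytes(LINE_SIZE - 2)


def recorded_owner(line: CacheLine) -> int:
    """Read back the processor number; only meaningful while the line is invalid."""
    if line.valid:
        raise ValueError("data area holds cached data while the line is valid")
    return int.from_bytes(line.data[:2], "little")
```

Reusing the data area costs no extra storage, because the data of an invalid line is stale anyway.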
Correspondingly, referring to FIG. 4, the procedure executed by the second processor may be as follows:
41: The second processor receives a bus read transaction unicast by the first processor, where the first processor sends the bus read transaction according to the information of the second processor recorded in the first Cache Line, the Tag of the first Cache Line has the same value as the Cache Tag in the address information generated by the first processor, and its status bit indicates the invalid state;
42: When the second processor stores a valid copy of the data of the first Cache Line, it provides the data of the first Cache Line to the first processor.
In this embodiment, a Cache Line records the information of the processor that most recently stored a copy of it, and on a read miss a bus read transaction is unicast to that processor according to the recorded information. This avoids the problems caused by broadcasting and reduces the power consumption overhead.
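The second processor's side of the exchange (steps 41 and 42) reduces to one tag comparison and a validity check. The following is a minimal sketch; the function name and the `tag -> (valid, data)` dictionary are hypothetical stand-ins for the second processor's Cache.

```python
def handle_unicast_read(local_lines, tag):
    """Serve a unicast bus read transaction at the second processor.

    local_lines maps a Tag to a (valid, data) pair.
    Returns the data if a valid copy is stored, otherwise None.
    """
    entry = local_lines.get(tag)
    if entry is None:
        return None  # no Cache Line with a matching Tag
    valid, data = entry
    return data if valid else None
```

Only the addressed processor runs this check; the other Caches on the bus perform no tag comparison for the transaction.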
FIG. 5 is a schematic flowchart of another embodiment of the read-miss handling method of the present invention, including:
51: The first processor generates a read address.
The structure of the read address is shown in FIG. 1 and includes the Cache Tag, the Cache index and the offset.
52: The first processor finds the corresponding Cache Set according to the index (Cache index) of the read address.
For example, referring to FIG. 1, one Cache index selects w Cache Lines, and these w Cache Lines form a Cache Set.
53: The first processor determines whether a matching Tag exists. If so, go to 54; otherwise, go to 59.
The Cache Tag in the read address can be compared with the Tag of each Cache Line in the Cache Set. If a Tag with the same value as the Cache Tag exists, a matching Tag exists; otherwise no matching Tag exists.
54: The first processor determines whether the status bit of the Tag-matched Cache Line is valid. If so, go to 55; otherwise, go to 56.
A Cache Line whose Tag is the same as the Cache Tag is the Tag-matched Cache Line.
As shown in FIG. 1, a Cache Line includes a status bit (Valid bit) that indicates whether the line is valid. For example, Valid bit = 1 indicates valid and Valid bit = 0 indicates invalid.
55: Read hit. The first processor can then read out the data.
56: The first processor reads a processor number from the data area of the Tag-matched Cache Line and unicasts a bus read transaction to that processor. This processor, which may be called the second processor, is the one that most recently stored a copy of the data of the Cache Line.
For a Cache Line, once its status bit indicates that it is invalid, for example Valid bit = 0, its data area needs to record, in addition to the written data, the information of the processor that modified the data of the Cache Line, and this record is updated each time a processor modifies the data. When a processor modifies the data of a Cache Line, it can first store that data locally, that is, the processor holds a copy of the data of the Cache Line.
Because the information of the modifying processor is recorded on every modification, the data area of the Cache Line holds the information, for example the processor number, of the processor that most recently modified its data, which is also the processor that most recently stored a copy of its data.
57: After receiving the bus read transaction, the second processor determines whether it holds a valid copy, that is, a copy of the data of the corresponding Cache Line. If so, go to 58; otherwise, go to 59.
The processor that receives the bus read transaction can likewise first compare the Cache Tag. After it finds a matching Cache Line, if the status bit of that Cache Line is valid, it holds a valid copy of the corresponding Cache Line; otherwise it holds no valid copy.
58: The second processor provides the Cache Line data to the first processor.
The second processor can send the first processor a bus read transaction response that carries the valid copy of the data of the corresponding Cache Line, thereby providing the Cache Line data.
59: The first processor broadcasts a bus read transaction, that is, it executes the existing Cache read-miss procedure.
The broadcast bus read transaction may be sent directly when the first processor determines that no matching Tag exists; or,
alternatively, when the second processor holds no valid copy of the data of the Cache Line, it sends the first processor a bus read transaction response indicating that it holds no valid copy, or carrying other indication information, to trigger the first processor to broadcast the bus read transaction; or,
alternatively, when the second processor holds no valid copy of the data of the Cache Line, it does not respond, and the first processor broadcasts the bus read transaction when it has received no response within the set time.
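Steps 53 to 59, including both fallback variants, can be sketched as a small simulation. Everything named here (`Reply`, `Node`, `read`, the `valid` and `invalid` dictionaries) is an assumption made for illustration; the timeout is modelled simply as a `SILENT` reply rather than with real timers.

```python
from enum import Enum


class Reply(Enum):
    DATA = "data"      # response carries a valid copy (step 58)
    NACK = "nack"      # response tells the requester to broadcast
    SILENT = "silent"  # no response; the requester's timer expires


class Node:
    """Toy processor: valid lines plus the owner recorded in invalid lines."""

    def __init__(self, node_id, reply_when_missing=Reply.NACK):
        self.node_id = node_id
        self.valid = {}    # Tag -> data, for valid Cache Lines
        self.invalid = {}  # Tag -> recorded processor number, for invalid lines
        self.reply_when_missing = reply_when_missing

    def on_unicast_read(self, tag):
        # Step 57: does this processor hold a valid copy?
        if tag in self.valid:
            return Reply.DATA, self.valid[tag]
        return self.reply_when_missing, None


def read(requester, nodes, tag):
    """Return (data, path), where path is 'hit', 'unicast' or 'broadcast'."""
    if tag in requester.valid:              # steps 53-55: read hit
        return requester.valid[tag], "hit"
    owner_id = requester.invalid.get(tag)
    if owner_id is not None:                # step 56: Tag matched, line invalid
        reply, data = nodes[owner_id].on_unicast_read(tag)
        if reply is Reply.DATA:
            return data, "unicast"
        # NACK or SILENT (timeout): fall through to the broadcast
    for node in nodes.values():             # step 59: existing broadcast flow
        if node is not requester and tag in node.valid:
            return node.valid[tag], "broadcast"
    return None, "broadcast"                # in a real system memory would supply the data
```

The unicast path touches one remote Cache; the broadcast path is reached only when no Tag matches or the recorded processor no longer holds a valid copy.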
The following takes the Modified Shared Invalid (MSI) protocol as an example to describe how the Cache state changes under the above processing.
A write-invalidate protocol guarantees that a processor has exclusive access to a data item before it writes that item; performing the write invalidates all other copies. The MSI protocol is a write-invalidate protocol that uses three states to distinguish the state of a Cache Line:
M: Modified state. The Cache Line has been modified by the processor, and its data is the only correct copy in the system. The corresponding data in main memory is stale, and the copies of this Cache Line in other Caches are invalid.
S: Shared state. Other Caches may also hold valid copies of this Cache Line.
I: Invalid state. The data in this Cache Line is invalid.
When a Cache Line is in the M or S state, the corresponding status bit (valid bit) is 1, indicating a valid state; when a Cache Line is in the I state, the corresponding status bit (valid bit) is 0, indicating an invalid state.
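The mapping from the three MSI states to the valid bit can be written down directly. The sketch below is illustrative only; the names `MSI` and `valid_bit` are not from the patent.

```python
from enum import Enum


class MSI(Enum):
    M = "Modified"
    S = "Shared"
    I = "Invalid"


def valid_bit(state: MSI) -> int:
    """Return 1 for the valid states (M, S) and 0 for the I state."""
    return 0 if state is MSI.I else 1
```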
The MSI state of a Cache Line is affected not only by the memory read and write operations of the local processor, but also by the memory read and write operations of other processors on the shared bus.
Figure 6 shows how the local processor's memory read and write operations affect the MSI state of a Cache Line. On a read hit, the Cache Line can only be in the M or S state. No bus transaction needs to be sent, the Cache provides the data, and the state of the Cache Line does not change.
A read miss may occur because the data to be accessed is not in the Cache, that is, the Tag does not match, or because the data is in the Cache but in the invalid state. In either case a bus read transaction must be sent, the data is obtained, and the Cache Line is set to the S state.
On a write hit to a Cache Line in the M state, the Cache Line is updated directly and its state does not change. On a write hit to a Cache Line in the S state, a bus write-invalidate transaction must be broadcast to invalidate the copies of the data block in other Caches; the Cache Line is then updated and its state changes to M.
A write miss may occur because the data to be accessed is not in the Cache, or because the data is in the Cache but in the invalid state. In either case a bus exclusive read transaction must be broadcast, the data is obtained and updated, and the Cache Line is set to the M state.
The difference from the prior art is as follows. The prior art broadcasts the bus read transaction for both kinds of read miss described above. In the embodiments of the present invention, when the read miss is caused by data that is in the Cache but in the invalid state, the bus read transaction is unicast. The destination of the unicast is the processor recorded in the data area of that Cache Line.
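The local-processor transitions described above can be summarized as a small transition function. This is a sketch under stated assumptions: the transaction labels `BusRd`, `BusInv`, and `BusRdX` are hypothetical shorthand for the bus read, bus write-invalidate, and bus exclusive read transactions, and the possible fallback from unicast to broadcast is not modeled here.

```python
from typing import Optional, Tuple


def local_access(state: str, tag_match: bool, op: str) -> Tuple[str, Optional[str]]:
    """Return (new_state, bus_action) for a local read or write.

    state is "M", "S" or "I" and is only meaningful when tag_match is True.
    bus_action is None when no bus transaction is needed.
    """
    hit = tag_match and state in ("M", "S")
    if op == "read":
        if hit:
            return state, None                 # read hit: state unchanged
        if tag_match:
            return "S", "unicast BusRd"        # Tag matches, line invalid
        return "S", "broadcast BusRd"          # no matching Tag
    if op == "write":
        if hit and state == "M":
            return "M", None                   # write hit in M
        if hit:
            return "M", "broadcast BusInv"     # write hit in S
        return "M", "broadcast BusRdX"         # write miss
    raise ValueError(f"unknown operation: {op}")
```

For example, `local_access("I", True, "read")` yields `("S", "unicast BusRd")`, the one case where the embodiment departs from the prior-art broadcast.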
Figure 7 shows how the memory read and write operations of processors on the bus affect the MSI state of a Cache Line.
When the local Cache does not contain the Cache Line specified by a bus read transaction or bus write transaction, or contains it but the Cache Line is in the I state, the local Cache either does not respond, or sends a response that triggers the processor that sent the bus read transaction to broadcast the bus read transaction.
When the local Cache contains the Cache Line specified by a bus read transaction and the Cache Line is in the S state, the local Cache responds to the bus read transaction and provides the data.
When the local Cache contains the Cache Line specified by a bus read transaction and the Cache Line is in the M state, the local Cache responds to the bus read transaction and provides the data, and then sets the local Cache Line to the S state.
When the local Cache contains the Cache Line specified by a bus write-invalidate transaction, it sets the state of that Cache Line to invalid.
When the local Cache contains the Cache Line specified by a bus exclusive read transaction and the Cache Line is in the M state, the local Cache responds to the bus exclusive read transaction and provides the data, and then sets the local Cache Line to the I state.
The difference from the prior art is that the bus read transaction described above is sent by unicast rather than by broadcast.
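The bus-side transitions can be sketched the same way. The labels `BusRd`, `BusInv`, and `BusRdX` are hypothetical names for the bus read, write-invalidate, and exclusive read transactions. The text does not say what an S-state line does on an exclusive read; the sketch assumes the usual MSI behavior of invalidating it without supplying data, and marks that branch as an assumption.

```python
from typing import Tuple


def snoop(state: str, present: bool, txn: str) -> Tuple[str, bool]:
    """Return (new_state, supplies_data) for a transaction seen on the bus.

    state is "M", "S" or "I" and is only meaningful when present is True.
    """
    if txn not in ("BusRd", "BusInv", "BusRdX"):
        raise ValueError(f"unknown transaction: {txn}")
    if not present or state == "I":
        return state, False        # no valid copy: nothing to supply
    if txn == "BusRd":
        return "S", True           # S stays S; M supplies data and drops to S
    if txn == "BusInv":
        return "I", False          # write-invalidate: drop the local copy
    if state == "M":
        return "I", True           # BusRdX on M: supply data, then invalidate
    return "I", False              # BusRdX on S: ASSUMPTION, not stated in the text
```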
The MSI protocol is used above as an example. Other existing coherence protocols, such as the Modified Exclusive Shared Invalid (MESI) protocol and the Modified Owned Exclusive Shared Invalid (MOESI) protocol, can likewise use unicast transmission for a read miss caused by a matching Tag with invalid data.
Figure 8 is a schematic structural diagram of an embodiment of a device for handling read misses according to the present invention. The device may be the first processor described above. The device 80 includes a generation module 81, an acquisition module 82, and a unicast module 83. The generation module 81 is configured to generate address information that contains a Cache Tag. The acquisition module 82 is configured to, when it is determined that a first Cache Line exists, obtain the information of the second processor recorded in the first Cache Line, where the Tag of the first Cache Line has the same value as the Cache Tag and the status bit indicates the invalid state. The unicast module 83 is configured to unicast a bus read transaction to the second processor according to the information of the second processor, so that the second processor provides the data of the first Cache Line when it stores a valid data copy of the first Cache Line.
Optionally, the device may further include:
a recording module, configured to record the information of the second processor in the first Cache Line after the status bit of the first Cache Line becomes invalid, where the second processor is the processor that most recently stored a data copy of the first Cache Line.
Optionally, the information of the second processor is recorded in the data area of the first Cache Line.
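The recording module reuses the data area of an invalidated Cache Line, which no longer holds usable data, to store the identity of the second processor. The sketch below illustrates the idea; the two-byte little-endian encoding, the 64-byte line size, and the names `Line`, `invalidate_and_record`, and `recorded_holder` are all assumptions made for the example, since the patent does not specify an encoding.

```python
from dataclasses import dataclass


@dataclass
class Line:
    tag: int
    valid: bool
    data: bytes


def invalidate_and_record(line: Line, holder_id: int, line_size: int = 64) -> None:
    """Mark the line invalid and store the holder's id in its data area."""
    line.valid = False
    # Hypothetical encoding: id in the first two bytes, rest zero-padded.
    line.data = holder_id.to_bytes(2, "little").ljust(line_size, b"\x00")


def recorded_holder(line: Line) -> int:
    """Read back the processor id recorded in an invalid line."""
    if line.valid:
        raise ValueError("data area holds real data while the line is valid")
    return int.from_bytes(line.data[:2], "little")
```

Because the Tag is left untouched, a later read to the same address still matches the line, finds it invalid, and can use `recorded_holder` to pick the unicast destination.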
Optionally, the device may further include:
a receiving module, configured to receive a bus read transaction response sent by the second processor, where the bus read transaction response contains the data of the first Cache Line when the second processor stores a valid data copy of the first Cache Line, or, when the second processor does not store a valid data copy of the first Cache Line, the bus read transaction response instructs the first processor to broadcast the bus read transaction on the bus; or,
a broadcast module, configured to broadcast the bus read transaction on the bus if no bus read transaction response from the second processor is received within a set time, where the second processor does not send the read transaction response when it does not store a valid data copy of the first Cache Line.
Figure 9 is a schematic structural diagram of another device provided by an embodiment of the present invention. The device may be the first processor described above. The first processor 90 includes a transceiver 91, a CPU 92, a cache memory (Cache) 93, and a bus 94 that connects these modules. The bus 94 may include a data bus, an address bus, a status bus, and the like. The CPU is configured to generate address information, which contains a Cache Tag, according to user input received through the transceiver and the bus, and, when it is determined that a first Cache Line exists, to obtain the information of the second processor recorded in the first Cache Line, where the Tag of the first Cache Line has the same value as the Cache Tag and the status bit indicates the invalid state. The CPU may also unicast a bus read transaction to the second processor through the transceiver and the bus according to the information of the second processor, so that the second processor provides the data of the first Cache Line when it stores a valid data copy of the first Cache Line. The Cache Line is located in the Cache. It can be understood that the processor may also include other modules, such as an arithmetic logic unit, a clock generator, a comparator, a timer, a reset circuit, and a modulator.
It should be noted that the devices shown in Figures 8 and 9 can be used to implement any of the methods for the first processor provided in the method embodiments above. The related terminology and implementations are the same as in those method embodiments and are not repeated here.
In this embodiment, the Cache Line records the information of the processor that most recently stored a copy of it, and on a read miss a bus read transaction is unicast to that processor according to the recorded information. This avoids the problems caused by broadcast transmission and reduces power consumption overhead.
Figure 10 is a schematic structural diagram of another embodiment of a device for handling read misses according to the present invention. The device may be the second processor described above. The device 100 includes a receiving module 101 and a sending module 102. The receiving module 101 is configured to receive a bus read transaction unicast by the first processor, where the bus read transaction is sent by the first processor according to the information of the second processor recorded in a first Cache Line, the Tag of the first Cache Line has the same value as the Cache Tag in the address information generated by the first processor, and the status bit indicates the invalid state. The sending module 102 is configured to provide the data of the first Cache Line to the first processor when a valid data copy of the first Cache Line is stored.
Optionally, the sending module is further configured to:
when no valid data copy of the first Cache Line is stored, send the first processor a bus read transaction response that instructs the first processor to broadcast the bus read transaction on the bus; or,
when no valid data copy of the first Cache Line is stored, send no bus read transaction response, so that the first processor, having received no bus read transaction response from the second processor within a set time, broadcasts the bus read transaction on the bus.
Figure 11 is a schematic structural diagram of another device provided by an embodiment of the present invention. The device may be the second processor described above. The second processor 110 includes a transceiver 111, a CPU 112, a cache memory (Cache) 113, and a bus 114 that connects these modules. The bus 114 may include a data bus, an address bus, a status bus, and the like. The transceiver is configured to receive a bus read transaction unicast by the first processor, where the bus read transaction is sent by the first processor according to the information of the second processor recorded in a first Cache Line, the Tag of the first Cache Line has the same value as the Cache Tag in the address information generated by the first processor, and the status bit indicates the invalid state. The transceiver is also configured to provide the data of the first Cache Line to the first processor when the CPU determines that the Cache stores a valid data copy of the first Cache Line. It can be understood that the processor may also include other modules, such as an arithmetic logic unit, a clock generator, a comparator, a timer, a reset circuit, and a modulator.
It should be noted that the devices shown in Figures 10 and 11 can be used to implement any of the methods for the second processor provided in the method embodiments above. The related terminology and implementations are the same as in those method embodiments and are not repeated here.
In this embodiment, the Cache Line records the information of the processor that most recently stored a copy of it, and on a read miss a bus read transaction is unicast to that processor according to the recorded information. This avoids the problems caused by broadcast transmission and reduces power consumption overhead.
Figure 12 is a schematic structural diagram of an embodiment of a system for handling read misses according to the present invention. The system 120 includes a first processor 121 and a second processor 122. For the first processor 121, see Figure 8 or Figure 9. For the second processor 122, see Figure 10 or Figure 11.
In this embodiment, the Cache Line records the information of the processor that most recently stored a copy of it, and on a read miss a bus read transaction is unicast to that processor according to the recorded information. This avoids the problems caused by broadcast transmission and reduces power consumption overhead.
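To make the claimed saving concrete, the following toy simulation counts how many caches must look up a request under unicast versus broadcast. It is a rough software model, not the patented hardware: the `Processor`, `Bus`, and `read` names, the `lookups` counter used as a stand-in for power cost, and the way an invalid line stores the holder id are all assumptions, and the write-back to memory on an M to S transition is omitted.

```python
from typing import Dict, List, Optional


class Processor:
    def __init__(self, pid: int) -> None:
        self.pid = pid
        # tag -> [state, payload]; payload is the line data, or the recorded
        # holder id when state == "I" (hypothetical encoding).
        self.lines: Dict[int, list] = {}

    def snoop_read(self, tag: int) -> Optional[bytes]:
        entry = self.lines.get(tag)
        if entry is None or entry[0] == "I":
            return None             # no valid copy
        entry[0] = "S"              # M drops to S; S stays S
        return entry[1]


class Bus:
    def __init__(self, procs: List[Processor], memory: Dict[int, bytes]) -> None:
        self.procs = {p.pid: p for p in procs}
        self.memory = memory
        self.lookups = 0            # caches that had to check a request

    def unicast_read(self, target: int, tag: int) -> Optional[bytes]:
        self.lookups += 1
        return self.procs[target].snoop_read(tag)

    def broadcast_read(self, sender: int, tag: int) -> bytes:
        data = None
        for pid, proc in self.procs.items():
            if pid != sender:
                self.lookups += 1
                reply = proc.snoop_read(tag)
                if reply is not None:
                    data = reply
        return data if data is not None else self.memory[tag]


def read(bus: Bus, proc: Processor, tag: int) -> bytes:
    entry = proc.lines.get(tag)
    if entry is not None and entry[0] != "I":
        return entry[1]                              # read hit
    data = None
    if entry is not None:                            # Tag matches, line invalid
        data = bus.unicast_read(entry[1], tag)
    if data is None:                                 # no Tag match, or unicast failed
        data = bus.broadcast_read(proc.pid, tag)
    proc.lines[tag] = ["S", data]
    return data
```

With four processors, a read miss that unicasts to the correct holder costs one lookup in this model, whereas a broadcast costs three. If the recorded holder no longer has a valid copy, the cost is one unicast lookup plus the three broadcast lookups.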
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional modules above is given only as an example. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or part of the functions described above. For the specific operation of the systems, apparatuses, and units described above, refer to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in those embodiments, or make equivalent replacements of some of their technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.
Claims (13)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210571969.5A CN103902470B (en) | 2012-12-25 | 2012-12-25 | Read processing method, equipment and the system during missing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210571969.5A CN103902470B (en) | 2012-12-25 | 2012-12-25 | Read processing method, equipment and the system during missing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103902470A CN103902470A (en) | 2014-07-02 |
CN103902470B true CN103902470B (en) | 2017-10-24 |
Family
ID=50993804
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210571969.5A Active CN103902470B (en) | 2012-12-25 | 2012-12-25 | Read processing method, equipment and the system during missing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103902470B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105677581A (en) * | 2016-01-05 | 2016-06-15 | 上海斐讯数据通信技术有限公司 | Internal storage access device and method |
CN114416606A (en) * | 2021-12-24 | 2022-04-29 | 北京奕斯伟计算技术有限公司 | Cache processing method, apparatus, device, readable storage medium and program product |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1851673A (en) * | 2005-12-13 | 2006-10-25 | 华为技术有限公司 | Processor system and its data operating method |
CN101008921A (en) * | 2007-01-26 | 2007-08-01 | 浙江大学 | Embedded heterogeneous polynuclear cache coherence method based on bus snooping |
CN202563494U (en) * | 2011-10-09 | 2012-11-28 | 西安交通大学 | Consistency maintenance device for multi-core processor |
-
2012
- 2012-12-25 CN CN201210571969.5A patent/CN103902470B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN103902470A (en) | 2014-07-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||