Disclosure of Invention
To remedy the defects described in the background art and to reduce snoop operations directed at all processor cores, the invention provides an adaptive combination method for cache coherence directory entries, an adaptive combination system for cache coherence directory entries, and a computer program product.
At least one embodiment of the present application provides an adaptive combination method for cache coherence directory entries, the method comprising:
in response to a request by a current processor core for read ownership of a current cache block, acquiring, from a preset directory, a first target directory entry set corresponding to an address of the current cache block;
when a target entry meeting a preset free space condition exists in the first target directory entry set, storing the number information of the current processor core in the target entry;
when no entry exists in the first target directory entry set, or no target entry in the first target directory entry set has free space for storing the number information of the current processor core, applying to the preset directory for a new entry and, when the application succeeds, storing the number information of the current processor core in the new entry and storing the new entry in the first target directory entry set.
At least one embodiment of the present application also provides an adaptive combination system for cache coherence directory entries, the system comprising:
a first target directory entry set determining module, configured to acquire, in response to a request by a current processor core for read ownership of a current cache block, a first target directory entry set corresponding to an address of the current cache block from a preset directory;
a first storage module, configured to store the number information of the current processor core in a target entry when a target entry meeting a preset free space condition exists in the first target directory entry set;
a second storage module, configured to apply to the preset directory for a new entry when no entry exists in the first target directory entry set, or no target entry in the first target directory entry set has free space for storing the number information of the current processor core, and, when the application succeeds, to store the number information of the current processor core in the new entry and store the new entry in the first target directory entry set.
At least one embodiment of the present application also provides an electronic device comprising a memory, a processor and a computer program stored on the memory, the processor executing the computer program to carry out the steps of the method as described above.
At least one embodiment of the present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method as described above.
At least one embodiment of the present application also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method as described above.
The embodiments of the application provide an adaptive combination method for cache coherence directory entries, an adaptive combination system for cache coherence directory entries, and a computer program product. Compared with the prior art, the method makes full use of the directory capacity and reduces snoop operations directed at all processor cores; and because the entries of a target directory entry set are arranged in order according to a preset preservation policy, entries in the set can be looked up more efficiently.
In some alternative embodiments, the method further comprises:
in response to a request by the current processor core for write ownership of the current cache block, acquiring, from the preset directory, a second target directory entry set corresponding to the address of the current cache block;
determining the corresponding processor cores according to the number information of the processor cores stored in each entry in the second target directory entry set, and issuing a copy invalidation instruction so that each of those processor cores invalidates its copy of the current cache block;
and issuing an entry update instruction so that the second target directory entry set contains only one initial directory entry, and storing the number information of the current processor core in the initial directory entry.
In some alternative embodiments, the method further comprises:
in response to a swap-out request of the current processor core for the current cache block, acquiring, from the preset directory, a target directory entry that corresponds to the address of the current cache block and stores the number information of the current processor core;
and deleting the number information of the current processor core from the target directory entry when the target directory entry exists.
In some optional embodiments, after the deleting of the number information of the current processor core from the target directory entry, the method further comprises:
deleting, from the target directory entry set that contains the target directory entry, any entry meeting a preset condition, wherein the preset condition is that the entry stores no number information of any processor core.
In some alternative embodiments, the method further comprises:
when the application for the new entry is unsuccessful, issuing an entry update instruction so that the first target directory entry set contains only one initial directory entry, and storing the number information of the current processor core in the initial directory entry in a preset basic content format, whereupon the first target directory entry set enters an imprecise sharing record state; a directory entry set in the imprecise sharing record state contains only a single directory entry.
In some alternative embodiments, the target directory entry set includes:
an isomorphic directory entry set and a heterogeneous directory entry set, wherein:
the entries in the same isomorphic directory entry set all use the same content format;
the entries in the same heterogeneous directory entry set may use one or more content formats.
In some optional embodiments, the content format of an entry is a bitmap enumeration mode or a number enumeration mode, where:
the bitmap enumeration mode records, as a bitmap, whether each of a plurality of processor cores with consecutive core numbers holds a copy of the cache block;
the number enumeration mode records, as processor core number values, which processor cores hold copies of the cache block.
In some alternative embodiments, in the first target directory entry set and the second target directory entry set, entries are ordered according to a preset preservation policy.
In some alternative embodiments, the preset preservation policy includes:
For any two adjacent entries in the target directory entry set, a first entry and a second entry, the values of the processor core number information stored in the first entry are either all smaller or all larger than the values stored in the second entry, so that the entries are arranged in ascending or descending order of processor core number.
In some alternative embodiments, the system further comprises:
a second target directory entry set determining module, configured to acquire, in response to a request by the current processor core for write ownership of the current cache block, a second target directory entry set corresponding to the address of the current cache block from the preset directory;
an invalidation module, configured to determine the corresponding processor cores according to the number information of the processor cores stored in each entry in the second target directory entry set, and to issue a copy invalidation instruction so that each of those processor cores invalidates its copy of the current cache block;
an updating module, configured to issue an entry update instruction so that the second target directory entry set contains only one initial directory entry, and to store the number information of the current processor core in the initial directory entry.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the following detailed description of the embodiments of the present invention will be given with reference to the accompanying drawings. However, those of ordinary skill in the art will understand that in various embodiments of the present invention, numerous technical details have been set forth in order to provide a better understanding of the present invention. The claimed invention may be practiced without these specific details and with various changes and modifications based on the following embodiments.
In view of the shortcomings of the prior art, an object of the embodiments of the present invention is to provide an adaptive combination method for cache coherence directory entries, an adaptive combination system for cache coherence directory entries, and a computer program product. Compared with the prior art, the method makes full use of the directory capacity and reduces snoop operations directed at all processor cores; and because the entries of a target directory entry set are arranged in order according to a preset preservation policy, entries in the set can be looked up more efficiently.
Embodiment one:
This embodiment of the invention relates to an adaptive combination method for cache coherence directory entries.
The implementation details of the adaptive combination method for cache coherence directory entries in this embodiment are described below from the perspective of reading a cache block. These details are provided only to aid understanding and are not required to implement this embodiment.
The adaptive combination method for cache coherence directory entries of this embodiment can be applied to electronic equipment with communication, computing and data storage capabilities. As shown in fig. 1, the method provided in this embodiment includes the following steps:
Step 110, in response to a request by a current processor core for read ownership of a current cache block, acquiring, from a preset directory, a first target directory entry set corresponding to an address of the current cache block.
Specifically, in response to the current processor core's application for read ownership of the current cache block, the preset directory looks up the first target directory entry set corresponding to the address of the current cache block. The target directory entry set stores entries that record the numbers of all processor cores holding the current cache block.
Step 120, under the condition that a target table entry meeting a preset idle space condition exists in the first target directory table entry set, the number information of the current processor core is stored in the target table entry.
The preset free space condition comprises that the available free space in the table entry is not smaller than the space required for storing the number information of the current processor core.
Specifically, when the first target directory entry set contains an entry with free space for recording the number information of the current processor core, the number information is recorded in that entry.
Step 130, when no entry exists in the first target directory entry set, or no target entry in the first target directory entry set has free space for storing the number information of the current processor core, applying to the preset directory for a new entry and, when the application succeeds, storing the number information of the current processor core in the new entry and storing the new entry in the first target directory entry set.
Specifically, when the first target directory entry set is empty, or none of its entries has free space to record the number information of the current processor core, a new entry is applied for from the directory. Once the application succeeds, the new entry is added to the first target directory entry set and the number information of the current processor core is recorded in it.
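As a minimal sketch of steps 110 to 130, the read-ownership flow can be illustrated in Python as follows. The names Directory, Entry, handle_read and ENTRY_CAPACITY are hypothetical and do not appear in the application; the sketch also assumes that each entry holds at most two core numbers and that a core already recorded is not recorded a second time.

```python
from dataclasses import dataclass, field

ENTRY_CAPACITY = 2  # assumption: each entry records at most two core numbers


@dataclass
class Entry:
    cores: list = field(default_factory=list)

    def has_free_space(self):
        return len(self.cores) < ENTRY_CAPACITY


class Directory:
    """Toy directory mapping a cache block address to its entry set."""

    def __init__(self, total_entries):
        self.free_entries = total_entries
        self.sets = {}

    def allocate(self):
        # Returns a new entry, or None when no free entry is left.
        if self.free_entries == 0:
            return None
        self.free_entries -= 1
        return Entry()

    def handle_read(self, addr, core):
        entry_set = self.sets.setdefault(addr, [])           # step 110
        if any(core in e.cores for e in entry_set):
            return True                                      # assumption: no duplicates
        target = next((e for e in entry_set if e.has_free_space()), None)
        if target is not None:                               # step 120
            target.cores.append(core)
            return True
        new_entry = self.allocate()                          # step 130
        if new_entry is None:
            return False  # failed application is outside this sketch
        new_entry.cores.append(core)
        entry_set.append(new_entry)
        return True
```

With a directory of two free entries, reads by cores 1, 2 and 3 of the same block would fill one entry and open a second; a later read that finds no free space and no free entry returns False.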
Embodiment two:
Based on the foregoing embodiment, the implementation details of the adaptive combination method for cache coherence directory entries in this embodiment are described below from the perspective of writing a cache block. These details are provided only to aid understanding and are not required to implement this embodiment.
The adaptive combination method for cache coherence directory entries of this embodiment can be applied to electronic equipment with communication, computing and data storage capabilities. As shown in fig. 2, the method provided in this embodiment includes the following steps:
Step 210, in response to a request by the current processor core for write ownership of the target cache block, acquiring, from a preset directory, a second target directory entry set corresponding to the address of the target cache block.
Specifically, in response to the current processor core's application for write ownership of the current cache block, the preset directory looks up the second target directory entry set corresponding to the address of the current cache block.
Step 220, determining a corresponding processor core according to the number information of the processor core stored in each entry in the second target directory entry set, and issuing an invalid copy instruction, so that each processor core executes an operation of invalidating the copy of the target cache block.
Specifically, an operation of invalidating the valid copy of the current cache block is initiated to each associated processor core according to the second set of target directory entries.
Step 230, issuing an entry update instruction so that the second target directory entry set contains only one initial directory entry and redundant directory entries are removed, and storing the number information of the current processor core in the initial directory entry.
Specifically, after the valid copies of the current cache block have been invalidated, the second target directory entry set retains only one initial directory entry, and the number information of the current processor core is recorded in that entry.
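Steps 220 and 230 can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of the application: handle_write and send_invalidate are hypothetical names, an entry is modeled as a plain list of core numbers, and the writing core is assumed not to invalidate its own copy.

```python
def handle_write(entry_set, writer, send_invalidate):
    """entry_set: list of entries, each a list of core numbers.

    Returns the entry set as it stands after the write request.
    """
    # Step 220: ask every recorded core to invalidate its copy.
    for entry in entry_set:
        for core in entry:
            if core != writer:  # assumption: the writer keeps its own copy
                send_invalidate(core)
    # Step 230: only one initial entry remains, recording the writer.
    return [[writer]]
```

For example, calling handle_write([[1, 2], [3]], writer=2, send_invalidate=print) would issue invalidations to cores 1 and 3 and return [[2]].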
Embodiment three:
Based on the above embodiment, the embodiment of the present invention relates to an adaptive combining method for caching coherence directory entries.
The implementation details of the adaptive combining method for cache coherence directory entries in the present embodiment are specifically described below, and the following description is merely provided for understanding the implementation details, and is not a necessity for implementing the present embodiment.
The adaptive combination method for cache coherence directory entries of this embodiment can be applied to electronic equipment with communication, computing and data storage capabilities. As shown in fig. 3, the method provided in this embodiment includes the following steps:
Step 310, in response to a request by the current processor core for read ownership of a target cache block, acquiring, from a preset directory, a first target directory entry set corresponding to an address of the target cache block;
Step 320, when a target entry meeting a preset free space condition exists in the first target directory entry set, storing the number information of the current processor core in the target entry;
Step 330, when no entry exists in the first target directory entry set, or no target entry in the first target directory entry set has free space for storing the number information of the current processor core, applying to the preset directory for a new entry and, when the application succeeds, storing the number information of the current processor core in the new entry and storing the new entry in the first target directory entry set;
Step 340, in response to a request by the current processor core for write ownership of the target cache block, acquiring, from the preset directory, a second target directory entry set corresponding to the address of the target cache block;
Step 350, determining the corresponding processor cores according to the number information of the processor cores stored in each entry in the second target directory entry set, and issuing a copy invalidation instruction so that each of those processor cores invalidates its copy of the target cache block;
Step 360, issuing an entry update instruction so that the second target directory entry set contains only one initial directory entry and redundant directory entries are removed, and storing the number information of the current processor core in the initial directory entry.
As an example, the method disclosed in this embodiment may specifically include the following process flows:
in response to the current processor core's application for read ownership of the current cache block, the preset directory looks up the first target directory entry set corresponding to the address of the current cache block, where the entries record the numbers of all processor cores holding the current cache block;
when an entry in the first target directory entry set has free space for recording the number information of the current processor core, the number information is recorded in that entry;
when the first target directory entry set is empty, or none of its entries has free space to record the number information of the current processor core, a new entry is applied for from the directory; once the application succeeds, the new entry is added to the first target directory entry set and the number information of the current processor core is recorded in it;
in response to the current processor core's application for write ownership of the current cache block, the preset directory looks up the second target directory entry set corresponding to the address of the current cache block, initiates an operation of invalidating the valid copy of the current cache block to each relevant processor core according to the second target directory entry set, then retains only one initial directory entry in the second target directory entry set and records the number information of the current processor core in that entry.
Embodiment four:
Based on the foregoing embodiments, this embodiment further explains the adaptive combination method for cache coherence directory entries provided in the foregoing embodiments.
In the related art, no matter how many copies of the same cache block exist in the private caches of the processor cores, the address of the cache block can correspond to at most one entry in the directory (in a two-level directory protocol, the entry for the same cache block in the first-level directory and the entry in the second-level directory of each core group are in effect the same entry). One of the main technical features of the invention is that the address of the same cache block corresponds to a set of entries in the directory. The number of entries in the set changes as the parallel program runs, and all entries in the set jointly record how the cache block is shared among all processor cores; that is, each entry records part of the sharing condition.
In step 310, in response to a request by the current processor core for read ownership of the target cache block, a first target directory entry set corresponding to the address of the target cache block is acquired from the preset directory.
In some embodiments, the target directory entry set comprises:
an isomorphic directory entry set and a heterogeneous directory entry set, wherein:
the entries in the same isomorphic directory entry set all use the same content format;
the entries in the same heterogeneous directory entry set may use one or more content formats.
In some optional embodiments, the content format of an entry is a bitmap enumeration mode or a number enumeration mode, and an entry can be converted between the two modes, wherein:
the bitmap enumeration mode records, as a bitmap, whether each of a plurality of processor cores with consecutive core numbers holds a copy of the cache block;
the number enumeration mode records, as processor core number values, which processor cores hold copies of the cache block.
Optionally, all entries in the same directory entry set may share one content format, in which case the set is called an isomorphic directory entry set. The entries in the same directory entry set may instead use several content formats, in which case the set is called a heterogeneous directory entry set; each directory entry then selects one of several formats to store the number information of the processor cores sharing the cache block. When the number of processor cores in a computing node reaches the hundreds, a single directory entry can hardly record precisely how an arbitrary cache block is shared among all processor cores, so a simplified format is often adopted.
For example, a directory entry has 64 bits, of which 34 bits hold the address tag of the cache block, 10 bits record the number of currently valid copies, and two further 10-bit fields record the processor core numbers of the owners of two valid copies. In the present application, this simplified format is referred to as the preset basic content format.
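The 64-bit layout in this example (34 + 10 + 10 + 10 bits) can be illustrated with a short Python sketch. The field order within the word and the names pack_basic and unpack_basic are assumptions made for illustration; the application specifies only the field widths.

```python
TAG_BITS, COUNT_BITS, CORE_BITS = 34, 10, 10  # widths from the example; sum is 64


def pack_basic(tag, count, owner0, owner1):
    """Pack one basic-format entry; assumed order: tag | count | owner0 | owner1."""
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("address tag does not fit in 34 bits")
    if not 0 <= count < (1 << COUNT_BITS):
        raise ValueError("copy count does not fit in 10 bits")
    for owner in (owner0, owner1):
        if not 0 <= owner < (1 << CORE_BITS):
            raise ValueError("core number does not fit in 10 bits")
    word = tag
    word = (word << COUNT_BITS) | count
    word = (word << CORE_BITS) | owner0
    word = (word << CORE_BITS) | owner1
    return word


def unpack_basic(word):
    """Inverse of pack_basic: returns (tag, count, owner0, owner1)."""
    core_mask = (1 << CORE_BITS) - 1
    owner1 = word & core_mask
    owner0 = (word >> CORE_BITS) & core_mask
    count = (word >> (2 * CORE_BITS)) & ((1 << COUNT_BITS) - 1)
    tag = word >> (2 * CORE_BITS + COUNT_BITS)
    return tag, count, owner0, owner1
```

A 10-bit core number field can address up to 1024 processor cores, which is consistent with a computing node containing hundreds of cores.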
The isomorphic and heterogeneous directory entry sets are further described below:
1) Isomorphic directory entry sets based on the basic content format. Every entry in the set uses the basic content format, all entries carry the same address tag, and each entry can record the processor core numbers of the owners of two valid copies. When the set contains N entries, the maximum number of valid copy owners that can be recorded precisely is therefore 2*N;
2) Heterogeneous directory entry sets based on a linked list structure. All entries in the set are organized into a linked list, or into a chained or indexed structure similar to those used for file storage. Taking the linked list as an example, only the entry at the head of the list records the address tag, and each entry has several bits that identify the next entry. When the list is doubly linked (which helps with ordering the entries in the set and removing redundant ones), each entry also has several bits that identify the previous entry. A directory generally also adopts a set-associative mapping strategy similar to that of a cache to speed up lookup by cache block address, and the number of directory entries within one group (set) is usually modest (for example, no more than 64). Since all entries of a directory entry set come from the same group, at most 7 bits are needed to identify the previous or the next entry, and the remaining 50 bits of a non-head entry can record the processor core numbers of 5 valid copy owners (the number enumeration mode). Alternatively, a non-head entry can record, as a bitmap, whether each of several processor cores with consecutive core numbers holds a copy of the cache block (the bitmap enumeration mode); in that case, 10 of the remaining 50 bits record the core number (or core group number) of the first processor core covered by the bitmap, and the other 40 bits record whether each of the corresponding 40 processor cores holds a copy of the cache block.
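The bit budget of a non-head entry in a doubly linked list, and the bitmap enumeration payload, can be sketched in Python as follows. The names encode_bitmap and decode_bitmap and the placement of the 10-bit starting core number above the 40-bit map are assumptions; the application does not state how the mode of an entry is flagged, so that is left out of the sketch.

```python
ENTRY_BITS, LINK_BITS, CORE_BITS = 64, 7, 10
PAYLOAD_BITS = ENTRY_BITS - 2 * LINK_BITS   # 50 bits left after the two links
NUMBER_SLOTS = PAYLOAD_BITS // CORE_BITS    # 5 core numbers in number enumeration
BITMAP_WIDTH = PAYLOAD_BITS - CORE_BITS     # 40 consecutive cores in bitmap mode


def encode_bitmap(base, cores):
    """Pack cores in [base, base + 40) as a 10-bit base plus a 40-bit map."""
    if not 0 <= base < (1 << CORE_BITS):
        raise ValueError("starting core number does not fit in 10 bits")
    bitmap = 0
    for core in cores:
        offset = core - base
        if not 0 <= offset < BITMAP_WIDTH:
            raise ValueError(f"core {core} is outside the bitmap window")
        bitmap |= 1 << offset
    return (base << BITMAP_WIDTH) | bitmap


def decode_bitmap(payload):
    """Return the sorted core numbers recorded in a bitmap payload."""
    base = payload >> BITMAP_WIDTH
    bitmap = payload & ((1 << BITMAP_WIDTH) - 1)
    return [base + i for i in range(BITMAP_WIDTH) if (bitmap >> i) & 1]
```

The constants reproduce the arithmetic of the text: 64 - 2 * 7 = 50 payload bits, which hold either five 10-bit core numbers or a 10-bit starting number plus a 40-bit map.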
In one heterogeneous directory entry set, entries in the number enumeration mode and entries in the bitmap enumeration mode may coexist. The number enumeration mode suits the case where the core numbers of the processor cores sharing the corresponding cache block are sparse, and the bitmap enumeration mode suits the case where those core numbers are dense. As the sharing of a cache block among the processor cores changes while the parallel program runs, entries can be converted adaptively between the number enumeration mode and the bitmap enumeration mode, so that the sharing condition of the cache block is recorded precisely by combining as few entries as possible.
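One possible way to choose between the two modes is sketched below in Python. It is a greedy heuristic invented for illustration, not the selection rule of the application, and it does not guarantee the minimum number of entries; plan_entries is a hypothetical name, and the capacities of 5 core numbers and 40 bitmap positions are taken from the linked list example.

```python
NUMBER_SLOTS = 5    # core numbers per entry in number enumeration mode
BITMAP_WIDTH = 40   # consecutive cores per entry in bitmap enumeration mode


def plan_entries(sharers):
    """Greedily assign sharer core numbers to entries, choosing a mode for each.

    Returns a list of (mode, cores) pairs, where mode is "bitmap" or "number".
    """
    cores = sorted(set(sharers))
    plan, i = [], 0
    while i < len(cores):
        # Count the cores that one bitmap window starting at cores[i] would cover.
        j = i
        while j < len(cores) and cores[j] - cores[i] < BITMAP_WIDTH:
            j += 1
        if j - i > NUMBER_SLOTS:
            # Dense run: one bitmap entry records more cores than a number entry.
            plan.append(("bitmap", cores[i:j]))
            i = j
        else:
            # Sparse run: a number entry records any five core numbers.
            plan.append(("number", cores[i:i + NUMBER_SLOTS]))
            i += NUMBER_SLOTS
    return plan
```

For sharers 0 to 9 plus 200, 400 and 600, the heuristic yields one bitmap entry for the dense run and one number entry for the three sparse cores.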
In step 320, if there is a target entry satisfying a preset free space condition in the first target directory entry set, the number information of the current processor core is stored in the target entry.
Optionally, whether in an isomorphic directory entry set or a heterogeneous directory entry set, one directory entry can usually record several processor cores' sharing of the same cache block. Before the number information of the current processor core is recorded in the target directory entry set, the entries already in the set should be checked for free space in which the number information can be recorded (for example, in the number enumeration mode or the bitmap enumeration mode), so as to improve the utilization of directory entries.
In step 330, if no entry exists in the first target directory entry set or no target entry exists in the first target directory entry set, which is used for storing the number information of the current processor core, a new entry is applied to the preset directory, and if the new entry is successfully applied, the number information of the current processor core is stored in the new entry, and the new entry is stored in the first target directory entry set.
In some embodiments, the method further comprises:
When the application for the new entry is unsuccessful, an entry update instruction is issued so that the first target directory entry set contains only one initial directory entry and redundant entries are removed, and the number information of the current processor core is stored in the initial directory entry in the preset basic content format. The first target directory entry set then enters an imprecise sharing record state; a directory entry set in this state contains only a single directory entry.
Optionally, applying for a new entry, adding it to the target directory entry set, and recording the number information of the current processor core in it is the natural procedure. When the directory has free entries (or, when the directory adopts a set-associative mapping policy, when the corresponding group has free entries), the application generally succeeds. When the directory has no free entry and the target directory entry set is empty (that is, the data of the cache block is being read into a cache for the first time), the application must succeed. When the directory has no free entry and the target directory entry set already contains entries, the application may succeed or fail, depending mainly on the relevant priority policy (similar to a cache replacement algorithm). When the application fails, the target directory entry set can no longer record precisely how the cache block is shared among all processor cores; the record can then be kept in the basic content format, and only one entry is retained in the target directory entry set (the remaining entries are released). When the application for a new entry causes an active entry to be preempted, the directory entry set to which the preempted entry belongs likewise can no longer record precisely how its cache block is shared among all processor cores.
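The fallback on a failed application can be sketched in Python as follows. EntrySet, record_reader and try_allocate are hypothetical names; the priority policy that decides whether an application succeeds is represented only by the try_allocate callback, and the behavior of a set that is already imprecise (it never grows beyond its single entry) is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class EntrySet:
    entries: list = field(default_factory=list)  # each entry: list of core numbers
    precise: bool = True


def record_reader(entry_set, core, capacity, try_allocate):
    """Record `core`; collapse to one imprecise entry if allocation fails."""
    if not entry_set.precise:
        # Assumption: an imprecise set keeps exactly one entry.
        only = entry_set.entries[0]
        if core not in only and len(only) < capacity:
            only.append(core)
        return
    for entry in entry_set.entries:
        if len(entry) < capacity:
            entry.append(core)
            return
    if try_allocate():
        entry_set.entries.append([core])
        return
    # Application failed: release all other entries, keep only the current
    # core in a single entry, and mark the sharing record as imprecise.
    entry_set.entries = [[core]]
    entry_set.precise = False
```

After the collapse, the set no longer lists cores 1 and 2 of the example below, which is why a later write request cannot rely on the record alone.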
In step 340, in response to a request by the current processor core for write ownership of the target cache block, a second target directory entry set corresponding to the address of the target cache block is acquired from the preset directory.
In step 350, the corresponding processor core is determined according to the number information of the processor core stored in each entry in the second target directory entry set, and an invalid copy instruction is issued, so that each processor core executes an operation of invalidating the copy of the target cache block.
In step 360, an entry update instruction is issued to cause only one initial directory entry to be included in the second target directory entry set and to remove redundant directory entries, and the number information of the current processor core is saved in the initial directory entry.
As an example, the specific flow of the method disclosed in this embodiment includes:
in response to the current processor core's application for read ownership of the current cache block, the preset directory looks up the first target directory entry set corresponding to the address of the current cache block, where the entries record the numbers of all processor cores holding the current cache block;
when an entry in the first target directory entry set has free space for recording the number information of the current processor core, the number information is recorded in that entry;
when the first target directory entry set is empty, or none of its entries has free space to record the number information of the current processor core, a new entry is applied for from the directory; once the application succeeds, the new entry is added to the first target directory entry set and the number information of the current processor core is recorded in it;
in response to the current processor core's application for write ownership of the current cache block, the preset directory looks up the second target directory entry set corresponding to the address of the current cache block, initiates an operation of invalidating the valid copy of the current cache block to each relevant processor core according to the second target directory entry set, then retains only one initial directory entry in the second target directory entry set and records the number information of the current processor core in that entry.
When the second target directory entry set precisely records all processor cores sharing the current cache block, an operation of invalidating the valid copy of the current cache block is initiated to each of those processor cores (corresponding to the classical implementation of a cache coherence protocol, in which the current processor core can obtain the latest copy of the cache block while the invalidation takes place). When the second target directory entry set does not precisely record all processor cores sharing the current cache block, an invalidation operation has to be initiated to nearly all processor cores so that all valid copies of the current cache block are invalidated. If the second target directory entry set is empty, no invalidation operation needs to be initiated, and a new directory entry has to be applied for. After the write request, only the current processor core holds a valid copy of the current cache block, so the second target directory entry set ultimately needs to keep only one entry.
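The three cases described here (precise record, imprecise record, empty set) can be illustrated with a small Python function. The name invalidation_targets is hypothetical, and reading "nearly all processor cores" as every core except the writer is an assumption of the sketch.

```python
def invalidation_targets(entry_set, precise, writer, all_cores):
    """Return the sorted core numbers that must invalidate their copy.

    entry_set: list of entries, each a list of core numbers.
    precise:   True when the set records every sharer exactly.
    """
    if not entry_set:
        return []  # empty set: no valid copy exists, nothing to invalidate
    if precise:
        recorded = {core for entry in entry_set for core in entry}
        return sorted(recorded - {writer})
    # Imprecise record: assumed to mean every core other than the writer.
    return sorted(set(all_cores) - {writer})
```

The precise case sends one message per recorded sharer, whereas the imprecise case costs a message to each of the remaining cores, which is the snoop traffic the entry set is meant to avoid.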
In some embodiments, the method further comprises:
in response to a swap-out request of the current processor core for the current cache block, acquiring, from the preset directory, a target directory entry that corresponds to the address of the current cache block and stores the number information of the current processor core;
and deleting the number information of the current processor core from the target directory entry when the target directory entry exists.
In some embodiments, after said deleting the number information of the current processor core from the target directory entry, the method further comprises:
deleting any entry that meets a preset condition from the target directory entry set that holds the target directory entry.
In some embodiments, the preset conditions include:
no number information of any processor core is stored in the entry.
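The swap-out handling and the removal of empty entries can be sketched as follows. This is a minimal illustration under stated assumptions: each entry is modeled as a plain Python list of core numbers, the directory as a dictionary keyed by cache block address, and the function name `handle_swap_out` is hypothetical. Dropping the address key once its set is empty is also an assumption of the example.

```python
def handle_swap_out(directory, address, core):
    """Remove `core` from the entry set for `address`; drop emptied entries."""
    entries = directory.get(address, [])
    # Look for the target directory entry that stores this core's number.
    target = next((e for e in entries if core in e), None)
    if target is None:
        return False  # no target directory entry exists: nothing to delete
    target.remove(core)
    # Preset condition: an entry that stores no core number is deleted.
    remaining = [e for e in entries if e]
    if remaining:
        directory[address] = remaining
    else:
        del directory[address]  # assumed: an empty set releases the address
    return True
```

For example, with `directory = {0x80: [[0, 1], [4]]}`, swapping out the block from core 4 leaves `{0x80: [[0, 1]]}`.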
Optionally, regardless of whether the target directory entry set accurately records the sharing status of the current cache block, the relevant content in the target directory entry set needs to be updated before the current cache block is swapped out. Where a target directory entry exists (corresponding to the case of a precise record), the number information of the current processor core needs to be removed from the contents of the target directory entry. Further, the target directory entry set may be optimized, for example by ensuring that at most one entry in the set has free space, by removing an entry that has no substantive content, or by converting an entry between a bitmap enumeration mode and a number enumeration mode.
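As an illustration of converting an entry between the two enumeration modes, the sketch below encodes a list of core numbers as a bitmap and decodes it again. The text does not specify the encoding, so the windowed layout (a `base` core number plus a fixed `width` of bits) and the function names are assumptions made for the example.

```python
def numbers_to_bitmap(core_numbers, base, width):
    """Encode core numbers in [base, base + width) as a bitmap integer."""
    bitmap = 0
    for n in core_numbers:
        if not base <= n < base + width:
            raise ValueError(f"core {n} outside window [{base}, {base + width})")
        bitmap |= 1 << (n - base)  # bit i stands for core number base + i
    return bitmap


def bitmap_to_numbers(bitmap, base, width):
    """Decode a bitmap integer back into an ascending list of core numbers."""
    return [base + i for i in range(width) if (bitmap >> i) & 1]
```

Under these assumptions, a number list suits a block with few sharers spread over a wide range, while a bitmap suits many sharers within one window of cores.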
In some embodiments, in the first target directory entry set and the second target directory entry set, entries are ordered according to a preset preservation policy.
In some embodiments, the preset preservation policy includes:
for any two adjacent entries in the target directory entry set, namely a first entry and a second entry, the values of the number information of the processor cores stored in the first entry are all smaller than, or all larger than, the values of the number information of the processor cores stored in the second entry.
Optionally, if the set contains multiple entries, the entry related to the number information of the current processor core must be found among all entries in the set. If the entries have no order relationship, all of them often have to be traversed. To accelerate the search, the entries may always be kept in a certain order, for example with the processor core numbers arranged in ascending or descending order; after the contents of the set change, the order is maintained by adjusting the contents among the entries.
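The benefit of keeping entries ordered can be sketched as follows: with ascending core numbers, the entry that may hold a given core is found by binary search instead of a full traversal. The function names, the list-of-lists model, and the repacking strategy used to restore order after an insertion are assumptions made for illustration only.

```python
def find_entry(entries, core):
    """Binary-search ascending, non-empty entries for the one holding `core`."""
    lo, hi = 0, len(entries) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        entry = entries[mid]
        if core < entry[0]:        # core is below this entry's smallest number
            hi = mid - 1
        elif core > entry[-1]:     # core is above this entry's largest number
            lo = mid + 1
        else:
            return mid if core in entry else -1
    return -1


def insert_core(entries, core, capacity):
    """Insert `core` and repack so entries stay ascending and densely filled."""
    numbers = sorted({n for e in entries for n in e} | {core})
    return [numbers[i:i + capacity] for i in range(0, len(numbers), capacity)]
```

A side effect of the assumed repacking is that at most one entry, the last, has free space.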
Embodiment five:
Another embodiment of the present application relates to an adaptive joint system for cache coherence directory entries.
The implementation details of the adaptive joint system for cache coherence directory entries of this embodiment are described below from the perspective of reading a cache block. These details are provided only for ease of understanding and are not necessary for implementing this embodiment. The adaptive joint system for cache coherence directory entries provided in this embodiment includes:
a first target directory entry set determining module, configured to, in response to a request by the current processor core to obtain read ownership of a target cache block, acquire a first target directory entry set corresponding to the address of the target cache block from a preset directory;
a first saving module, configured to save the number information of the current processor core in a target entry when a target entry meeting a preset free space condition exists in the first target directory entry set;
a second saving module, configured to apply to the preset directory for a new entry when no entry exists in the first target directory entry set, or when no target entry with free space for saving the number information of the current processor core exists in the first target directory entry set, and, when the application for the new entry succeeds, to save the number information of the current processor core in the new entry and save the new entry in the first target directory entry set.
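The cooperation of the three modules on a read request can be sketched as follows. This is a simplified software model under stated assumptions, not the structure of the claimed system: the class `Directory`, the constants `ENTRY_CAPACITY` and `DIRECTORY_BUDGET`, the early return for an already recorded core, and the `False` result on a failed application are all hypothetical choices for the example.

```python
ENTRY_CAPACITY = 4     # assumed: core numbers one entry can hold
DIRECTORY_BUDGET = 64  # assumed: total entries owned by the directory


class Directory:
    def __init__(self, budget=DIRECTORY_BUDGET):
        self.free_entries = budget
        self.sets = {}  # cache block address -> list of entries

    def allocate_entry(self):
        """Apply for a new entry; return None when the directory is full."""
        if self.free_entries == 0:
            return None
        self.free_entries -= 1
        return []

    def handle_read_request(self, address, core):
        """Record `core` as a sharer of `address`; False if it cannot be recorded."""
        # Determining module: fetch the entry set for this address.
        entries = self.sets.setdefault(address, [])
        if any(core in e for e in entries):
            return True  # assumed: an already recorded core needs no update
        # First saving module: use an entry that still has free space.
        for entry in entries:
            if len(entry) < ENTRY_CAPACITY:
                entry.append(core)
                return True
        # Second saving module: apply for a new entry and join it to the set.
        new_entry = self.allocate_entry()
        if new_entry is None:
            return False  # failed application; handling is out of scope here
        new_entry.append(core)
        entries.append(new_entry)
        return True
```

In this model a block shared by few cores occupies one entry, and further entries are joined to the set only as the number of sharers grows.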
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of each module in the adaptive joint system for cache coherence directory entries may be found in the corresponding process of the foregoing method embodiments and is not repeated here.
It should be noted that each module in this embodiment is a logic module. In practical applications, one logic module may be one physical unit, part of one physical unit, or a combination of multiple physical units. In addition, to highlight the innovative part of the present application, units less closely related to solving the technical problem addressed by the present application are not introduced in this embodiment; this does not mean that no other units exist in this embodiment.
Embodiment six:
Based on the foregoing embodiments, another embodiment of the present application relates to an adaptive joint system for cache coherence directory entries.
The implementation details of the adaptive joint system for cache coherence directory entries of this embodiment are described below from the perspective of writing to a cache block. These details are provided only for ease of understanding and are not necessary for implementing this embodiment. The adaptive joint system for cache coherence directory entries provided in this embodiment includes:
a second target directory entry set determining module, configured to, in response to a request by the current processor core to obtain write ownership of a target cache block, acquire a second target directory entry set corresponding to the address of the target cache block from a preset directory;
an invalidation module, configured to determine the corresponding processor cores according to the number information of the processor cores stored in each entry of the second target directory entry set, and to issue a copy invalidation instruction so that each of those processor cores invalidates its copy of the target cache block;
an updating module, configured to issue an entry update instruction so that the second target directory entry set contains only one initial directory entry, redundant directory entries are removed, and the number information of the current processor core is stored in the initial directory entry.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of each module in the adaptive joint system for cache coherence directory entries may be found in the corresponding process of the foregoing method embodiments and is not repeated here.
It should be noted that each module in this embodiment is a logic module. In practical applications, one logic module may be one physical unit, part of one physical unit, or a combination of multiple physical units. In addition, to highlight the innovative part of the present application, units less closely related to solving the technical problem addressed by the present application are not introduced in this embodiment; this does not mean that no other units exist in this embodiment.
Embodiment seven:
Based on the foregoing embodiments, another embodiment of the present application relates to an adaptive joint system for cache coherence directory entries.
The implementation details of the adaptive joint system for cache coherence directory entries of this embodiment are described below. These details are provided only for ease of understanding and are not necessary for implementing this embodiment. The adaptive joint system for cache coherence directory entries provided in this embodiment includes:
a first target directory entry set determining module, configured to, in response to a request by the current processor core to obtain read ownership of a target cache block, acquire a first target directory entry set corresponding to the address of the target cache block from a preset directory;
a first saving module, configured to save the number information of the current processor core in a target entry when a target entry meeting a preset free space condition exists in the first target directory entry set;
a second saving module, configured to apply to the preset directory for a new entry when no entry exists in the first target directory entry set, or when no target entry with free space for saving the number information of the current processor core exists in the first target directory entry set, and, when the application for the new entry succeeds, to save the number information of the current processor core in the new entry and save the new entry in the first target directory entry set;
a second target directory entry set determining module, configured to, in response to a request by the current processor core to obtain write ownership of the target cache block, acquire a second target directory entry set corresponding to the address of the target cache block from the preset directory;
an invalidation module, configured to determine the corresponding processor cores according to the number information of the processor cores stored in each entry of the second target directory entry set, and to issue a copy invalidation instruction so that each of those processor cores invalidates its copy of the target cache block;
an updating module, configured to issue an entry update instruction so that the second target directory entry set contains only one initial directory entry, redundant directory entries are removed, and the number information of the current processor core is stored in the initial directory entry.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of each module in the adaptive joint system for cache coherence directory entries may be found in the corresponding process of the foregoing method embodiments and is not repeated here.
It should be noted that each module in this embodiment is a logic module. In practical applications, one logic module may be one physical unit, part of one physical unit, or a combination of multiple physical units. In addition, to highlight the innovative part of the present application, units less closely related to solving the technical problem addressed by the present application are not introduced in this embodiment; this does not mean that no other units exist in this embodiment.
Embodiment eight:
Another embodiment of the application is directed to an electronic device comprising at least one processor and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the embodiments described above.
The memory and the processor are connected by a bus. The bus may comprise any number of interconnected buses and bridges, and it connects the various circuits of the one or more processors and the memory together. The bus may also connect various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and are therefore not described further herein. A bus interface provides an interface between the bus and a transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. Data processed by the processor is transmitted over a wireless medium via an antenna; the antenna also receives data and passes the data to the processor.
The processor is responsible for managing the bus and for general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory may be used to store data used by the processor in performing operations.
Embodiment nine:
Another embodiment of the application relates to a computer-readable storage medium storing a computer program. The computer program implements the above-described method embodiments when executed by a processor.
That is, those skilled in the art will understand that all or part of the steps of the methods in the embodiments described above may be implemented by a program stored in a storage medium, where the program includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or part of the steps of the methods in the embodiments of the application. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
In some embodiments of the application, a computer program product is also provided, comprising a computer program which, when executed by a processor, implements the steps of the method described in the above embodiments.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples of carrying out the application and that various changes in form and details may be made therein without departing from the spirit and scope of the application.