Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the present specification. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present description as detailed in the accompanying claims.
The terminology used in the description presented herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the description. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in this specification to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, the first information may also be referred to as second information, and similarly, the second information may also be referred to as first information, without departing from the scope of the present description. The term "if" as used herein may be interpreted as "when," "upon," or "in response to a determination," depending on the context.
A CPU core typically includes an MMU (Memory Management Unit) that translates the virtual addresses used in memory accesses into physical addresses, and may store the mappings between virtual addresses and physical addresses in the TLB (Translation Lookaside Buffer) of the CPU core.
When an application program running on a CPU core performs a memory access, the MMU first looks up the physical address corresponding to the virtual address in the TLB, and then performs the memory access based on the physical address found. If the TLB does not store the physical address corresponding to the virtual address, the MMU needs to look up the physical address based on the process page table in memory.
To isolate the virtual addresses used by different applications, the ASID (Address Space ID, address space identifier) is introduced. After an application is initialized, the operating system may generate an ASID for the application and bind the ASID to the application's process identifier (PID). Different applications have different ASIDs, so different applications can use the same virtual address. The TLB stores the mapping relationship among the ASID, the virtual address, and the physical address.
A process is the carrier of a running application; one run of an application typically corresponds to one process. A thread is the smallest unit of execution within a process: it is contained in the process and is the actual unit of operation. A process typically includes multiple threads that share the memory space of the application. When a process or a thread accesses memory, the ASID corresponding to the process is used, and the MMU looks up in the TLB the physical address corresponding to the ASID and the virtual address.
For multi-core CPUs, there are many scenarios in which execution spans CPU cores.
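As a non-limiting illustration of how an ASID lets different applications reuse the same virtual address, the following minimal sketch models a TLB as a dictionary keyed by the pair (ASID, virtual address). The class name `Tlb`, the second ASID 9, and the physical address 0x5000 are assumptions made only for this sketch; a real TLB is a hardware structure and is normally indexed by virtual page number rather than by full address.

```python
from typing import Dict, Optional, Tuple


class Tlb:
    """Toy TLB: maps (ASID, virtual address) to a physical address."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[int, int], int] = {}

    def insert(self, asid: int, vaddr: int, paddr: int) -> None:
        self._entries[(asid, vaddr)] = paddr

    def lookup(self, asid: int, vaddr: int) -> Optional[int]:
        # Returns None on a TLB miss.
        return self._entries.get((asid, vaddr))


tlb = Tlb()
# Two applications use the same virtual address under different ASIDs
# (ASID 9 and 0x5000 are hypothetical values).
tlb.insert(asid=7, vaddr=0x800000, paddr=0x2000)
tlb.insert(asid=9, vaddr=0x800000, paddr=0x5000)
```

Because the ASID is part of the key, the two entries coexist, and a lookup with an ASID that has no entry is a miss.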
For example, a process of a certain application program is scheduled by a scheduler from CPU core 1 to CPU core 2.
For another example, multiple threads of a process run in different CPU cores.
In these cross-core scenarios, the processes or threads involved may share the same memory space because they belong to the same application. However, the MMU of each CPU core still needs to look up the physical address corresponding to the virtual address based on the process page table in memory and then store the mapping in its own TLB. This results in repeated process page table lookups, wastes processing resources of the CPU cores, and degrades IO performance.
The present specification provides a memory access scheme for a computer system that can improve IO performance in cross-core scenarios and save processing resources of the CPU cores.
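The repeated lookup described above can be made concrete with a small model of the related-art behavior, in which each core consults only its own TLB and otherwise searches the process page table. The names `Core`, `PAGE_TABLE`, and `page_walks`, as well as the address values, are assumptions for illustration only.

```python
from typing import Dict, Tuple

# Hypothetical process page table shared by all cores: (ASID, vaddr) -> paddr.
PAGE_TABLE: Dict[Tuple[int, int], int] = {(7, 0x800000): 0x2000}


class Core:
    def __init__(self, core_id: int) -> None:
        self.core_id = core_id
        self.tlb: Dict[Tuple[int, int], int] = {}
        self.page_walks = 0  # page-table searches performed by this core

    def translate(self, asid: int, vaddr: int) -> int:
        key = (asid, vaddr)
        if key in self.tlb:        # TLB hit
            return self.tlb[key]
        self.page_walks += 1       # TLB miss: search the process page table
        paddr = PAGE_TABLE[key]
        self.tlb[key] = paddr      # cache the mapping for later accesses
        return paddr


core1, core2 = Core(1), Core(2)
core1.translate(7, 0x800000)
core2.translate(7, 0x800000)  # same process and address, yet a second search
core2.translate(7, 0x800000)  # now a hit in core 2's own TLB
total_walks = core1.page_walks + core2.page_walks
```

In this model the same mapping is searched for twice, once per core, which is the redundancy the scheme described below aims to avoid.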
Fig. 1 is a flow chart illustrating a memory access method according to an exemplary embodiment of the present disclosure.
Referring to fig. 1, the memory access method can be used for memory access in a computer system. The computer system includes a CPU, the CPU includes a plurality of CPU cores, and each CPU core includes a TLB. The method is applied to a CPU core, for example to the MMU of the CPU core, and includes the following steps:
Step 102, in response to a memory access request, searching a physical address corresponding to a virtual address carried by the memory access request in a TLB.
In this specification, a process or a thread running on a CPU core may initiate a memory access request (hereinafter described uniformly as a thread initiating a memory access request, because the thread is the actual unit of operation in the process). The memory access request typically carries the virtual address to be accessed. In a CPU that supports ASIDs, the memory access request also carries the ASID of the application program; this specification takes a CPU that supports ASIDs as an example.
In response to the memory access request, the MMU of the CPU core may first query the TLB of the CPU core, and query whether the TLB stores the physical address corresponding to the virtual address. For example, the TLB is queried for a physical address corresponding to the ASID and the virtual address specified by the memory access request.
If the corresponding physical address is found in the TLB (TLB Hit), memory access may be performed based on the found physical address.
If the corresponding physical address is not found in the TLB (TLB Miss), step 104 below may be performed.
It should be noted that, in the related art, if the corresponding physical address is not found in the TLB, the physical address is looked up based on the process page table in memory. With the memory access scheme provided by this specification, if the corresponding physical address is not found in the TLB, step 104 below may be executed directly without looking up the physical address in the process page table, or the process page table lookup may be performed in parallel with step 104; this specification does not particularly limit this.
Step 104, sending an address detection request (also referred to below as an address probe request) that carries the virtual address, so that a CPU core receiving the address detection request searches its TLB for the physical address corresponding to the virtual address and, where that physical address is found, returns an address detection response carrying the found physical address.
Based on the lookup result of step 102, where the physical address corresponding to the ASID and the virtual address is not cached in the TLB, the CPU core may construct an address probe request and add the ASID and the virtual address to it.
The address probe request can be constructed based on a cache snooping protocol (Snoop Protocol), which is a hardware mechanism for maintaining cache coherence in multi-core processors. Of course, in other examples of the specification, the address probe request may be constructed based on other protocols, which is not particularly limited in this specification.
In one example, the CPU core may broadcast the constructed address probe request to all CPU cores.
In another example, the CPU core may also send a constructed address probe request to the designated CPU core. For example, the CPU core may first read a target core identification of a target CPU core from a specified register, and then send the address probe request to the target CPU core based on the target core identification.
The target CPU core may be a CPU core where other threads are located in a process where the thread which initiates the memory access request belongs, or a CPU core where the thread which initiates the memory access request is located before being scheduled to the current CPU core. The target core identification of the target CPU core may be written to the specified register by a scheduler.
In the case of sending the address probe request to a designated CPU core, the target core identification may be added as a parameter to the address probe request after being read from a register.
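The two delivery options, broadcast and delivery to a designated core, can be sketched as follows. The message layout (`AddressProbeRequest` and its fields), the helper names, and the choice to exclude the sender from a broadcast are assumptions for illustration; the specification does not prescribe a message format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AddressProbeRequest:
    source_core: int
    asid: int
    vaddr: int
    target_core: Optional[int] = None  # None means broadcast


def build_probe_request(source_core: int, asid: int, vaddr: int,
                        target_register: Optional[int]) -> AddressProbeRequest:
    """Build a request; target_register models the value read from the specified register."""
    return AddressProbeRequest(source_core, asid, vaddr, target_register)


def destinations(req: AddressProbeRequest, all_cores: List[int]) -> List[int]:
    """Return the cores that should receive the request."""
    if req.target_core is not None:
        return [req.target_core]
    # Broadcast: every core except the sender (an assumption of this sketch).
    return [c for c in all_cores if c != req.source_core]


cores = [0, 4, 8, 12]  # hypothetical core identifications
broadcast = build_probe_request(12, asid=7, vaddr=0x800000, target_register=None)
targeted = build_probe_request(12, asid=7, vaddr=0x800000, target_register=8)
```

Carrying the target core identification inside the request, as in the targeted case, lets whatever forwards the request route it without consulting the register again.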
In this specification, after receiving an address probe request sent by another CPU core, a CPU core may query whether its TLB caches the physical address corresponding to the ASID and the virtual address carried by the address probe request.
In the case where the physical address corresponding to the ASID and virtual address is cached in its TLB, the physical address may be added to the address probe response and returned to the CPU core that sent the address probe request.
In the case that the physical address corresponding to the ASID and the virtual address is not cached in the TLB, an address probe response may also be returned to the CPU core that sent the address probe request, where the address probe response does not carry the physical address.
Referring to the CPU block diagram shown in fig. 2, the CPU cores are connected by a bus, such as a ring bus or a mesh network bus. Address probe requests and address probe responses can be transmitted among the CPU cores over the bus.
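The behavior of the receiving core can be sketched as a single lookup that always produces a response, with or without a physical address. The type `AddressProbeResponse`, the function `handle_probe`, and the second virtual address 0x900000 are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class AddressProbeResponse:
    responder_core: int
    paddr: Optional[int]  # None when the responder's TLB has no mapping


def handle_probe(responder_core: int, tlb: Dict[Tuple[int, int], int],
                 asid: int, vaddr: int) -> AddressProbeResponse:
    """Look up (ASID, vaddr) in the local TLB and answer the probe."""
    return AddressProbeResponse(responder_core, tlb.get((asid, vaddr)))


tlb8 = {(7, 0x800000): 0x2000}
hit = handle_probe(8, tlb8, asid=7, vaddr=0x800000)
miss = handle_probe(8, tlb8, asid=7, vaddr=0x900000)  # hypothetical address
```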
It should be noted that the transmission directions of the address probe request/address probe response shown in fig. 2 are only exemplary, and represent that the address probe request and the address probe response are transmitted between the CPU cores, and do not represent an actual transmission path.
Step 106, in response to a received address detection response, storing the mapping relationship between the virtual address and the physical address carried in the address detection response into the TLB, and performing memory access based on the physical address.
In this specification, after receiving an address probe response to an address probe request that it sent, the CPU core may extract the physical address from the address probe response and store the mapping relationship among the physical address, the virtual address, and the ASID in its TLB. The CPU core may then perform memory access based on the physical address.
Where no address probe response to the address probe request is received, the CPU core may look up the physical address corresponding to the virtual address based on the process page table in memory.
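Steps 102 to 106, together with the page-table fallback, can be summarized in one function. This is only a sketch under simplifying assumptions: the probe over the bus is modeled as a direct lookup in the peers' TLBs, "no response" is modeled as no peer holding the mapping, and the function and variable names and the address values are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Key = Tuple[int, int]  # (ASID, virtual address)
Tlb = Dict[Key, int]


def translate(local_tlb: Tlb, peer_tlbs: List[Tlb], asid: int, vaddr: int,
              walk_page_table: Callable[[int, int], int]) -> Tuple[int, str]:
    """Return (physical address, source), where source names how it was found."""
    key = (asid, vaddr)
    # Step 102: query the local TLB first.
    if key in local_tlb:
        return local_tlb[key], "local-tlb"
    # Step 104: probe the other cores' TLBs (modeled as a direct lookup).
    for peer in peer_tlbs:
        found = peer.get(key)
        if found is not None:
            # Step 106: cache the mapping, then access memory with it.
            local_tlb[key] = found
            return found, "probe"
    # No response carried a physical address: fall back to the page table.
    paddr = walk_page_table(asid, vaddr)
    local_tlb[key] = paddr
    return paddr, "page-table"


page_table = {(7, 0x800000): 0x2000, (7, 0x900000): 0x3000}


def walk(asid: int, vaddr: int) -> int:
    return page_table[(asid, vaddr)]


requester_tlb: Tlb = {}
peer_tlb: Tlb = {(7, 0x800000): 0x2000}
first = translate(requester_tlb, [peer_tlb], 7, 0x800000, walk)
second = translate(requester_tlb, [peer_tlb], 7, 0x800000, walk)
third = translate(requester_tlb, [peer_tlb], 7, 0x900000, walk)
```

The first access is served by the peer, the second by the requester's own TLB, and the third falls back to the page table because no peer caches that mapping.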
As can be seen from the above description, in the case where the CPU core in the present specification does not store the physical address corresponding to the virtual address in the TLB, the CPU core sends an address probe request to the other CPU cores, and the other CPU cores find whether the physical address corresponding to the virtual address is stored in the respective TLB, and may add the found physical address to the address probe response to return. The CPU core may then store the mapping between the physical address and the virtual address in its TLB and perform memory accesses.
By adopting the technical scheme provided by the specification, TLB cache sharing among CPU cores can be realized, repeated process page table inquiry by the CPU cores in a scene of running across the CPU cores is greatly reduced, waste of CPU core processing resources is reduced, time consumption of address translation under the condition of TLB Miss is effectively shortened, and IO performance is improved.
On the other hand, the technical scheme provided by the specification can be realized based on the existing hardware and the cache detection protocol, does not need to add new hardware, and has low cost and high feasibility.
The specific implementation of the present specification is described in detail below based on the aforementioned two scenarios running across CPU cores, respectively.
1. Multiple threads of the same process run on different CPU cores
In this specification, multiple threads of the same process share memory space and use the same ASID, namely the ASID bound to the process to which they belong.
TABLE 1
Referring to the example of Table 1, assume that a process includes 4 threads, thread 1 to thread 4, where thread 1 and thread 2 run on CPU core 8, thread 3 and thread 4 run on CPU core 12, and threads 1-4 all use ASID 7.
Assume that thread 1 performs a memory access to virtual address 0x800000. The MMU of CPU core 8 searches TLB 8, but TLB 8 does not store the physical address corresponding to virtual address 0x800000 and ASID 7. The MMU therefore looks up the corresponding physical address based on the process page table in memory, and stores the mapping relationship among the found physical address, the virtual address, and the ASID in TLB 8. CPU core 8 may then perform the memory access based on the found physical address.
ASID | Virtual address | Physical address
7 | 0x800000 | 0x2000
TABLE 2
Assuming that the physical address found is 0x2000, TLB 8 may store the TLB entry shown in Table 2 above. Note that Table 2 is merely illustrative; in actual implementations, TLB entries may also include other fields, such as access rights (read or write) and page type.
Suppose thread 3 also needs to perform a memory access to virtual address 0x800000. The MMU of CPU core 12 searches TLB 12, but TLB 12 does not store the physical address corresponding to virtual address 0x800000 and ASID 7. In the related art, CPU core 12 would then look up the physical address based on the process page table in memory. To avoid such a repeated lookup, with the solution provided in this specification, CPU core 12 may construct an address probe request to which virtual address 0x800000 and ASID 7 are added.
In one example, CPU core 12 may broadcast the address probe request to all CPU cores over a bus. In a multi-core CPU architecture, the bus can be implemented as a mesh network, which keeps the latency low.
In another example, referring to FIG. 3, CPU core 12 may send the address probe request to CPU core 8 where threads 1-2 belonging to the same process as thread 3 are located.
In this example, the CPU core 12 may first read the core identification 8 of the CPU core 8 from the specified register, and then add the core identification 8 as a parameter to the address probe request as well. Taking the Snoop protocol as an example, the address probe request is sent to a Snoop Agent, and the Snoop Agent may send the address probe request to the CPU core 8 according to the core identifier 8 carried in the address probe request.
The core identification 8 in the register may be written by the scheduler. The scheduler knows all threads of a process and the CPU core on which each thread runs, and may write, into the specified registers of these CPU cores, the core identifications of the CPU cores on which the threads of the process run.
Still taking the case shown in Table 1 as an example, the threads of the process run on two CPU cores, CPU core 8 and CPU core 12. The scheduler may write core identification 8 into the specified register of CPU core 12, and core identification 12 into the specified register of CPU core 8. Of course, the current CPU core need not be excluded: both core identification 8 and core identification 12 may be written into the specified registers of CPU core 8 and of CPU core 12. Note that in the example of Table 1 the process runs on two CPU cores; in other examples, the process may run on 3 or more CPU cores, which is not particularly limited in this specification.
With continued reference to fig. 3, after receiving the address probe request sent by CPU core 12, CPU core 8 searches TLB 8 and finds the physical address 0x2000 corresponding to virtual address 0x800000 and ASID 7. CPU core 8 then adds the physical address 0x2000, the virtual address 0x800000, and ASID 7 to the address probe response and returns the response to CPU core 12. CPU core 12 may store the mapping relationship among the physical address 0x2000, the virtual address 0x800000, and ASID 7 in TLB 12, that is, it also forms the TLB entry shown in Table 2. CPU core 12 may then perform the memory access based on physical address 0x2000.
In this specification, CPU core 8 may alternatively add only the physical address 0x2000 to the address probe response, which is not particularly limited in the present specification.
When physical address probing is implemented based on the Snoop protocol, forwarding of address probe requests and address probe responses is typically handled by a Snoop Agent. For example, after receiving the address probe responses returned by different CPU cores, the Snoop Agent may filter out the responses that do not carry a physical address, deduplicate the responses from different CPU cores that carry the same lookup result, and then return a single address probe response carrying the physical address to the CPU core that sent the address probe request.
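The register contents that the scheduler would write in this example can be derived from the thread placement, as the following sketch shows. The function `write_target_registers` and the dictionary standing in for the specified registers are assumptions for illustration; the `include_self` flag corresponds to the variant in which the current CPU core is not excluded.

```python
from typing import Dict, List, Set


def write_target_registers(thread_to_core: Dict[str, int],
                           include_self: bool = False) -> Dict[int, List[int]]:
    """Return, per core, the core identifications the scheduler would write."""
    cores: Set[int] = set(thread_to_core.values())
    registers: Dict[int, List[int]] = {}
    for core in cores:
        peers = cores if include_self else cores - {core}
        registers[core] = sorted(peers)
    return registers


# Thread placement from the Table 1 example.
placement = {"thread1": 8, "thread2": 8, "thread3": 12, "thread4": 12}
regs = write_target_registers(placement)
regs_with_self = write_target_registers(placement, include_self=True)
```

The same function covers a process whose threads run on 3 or more CPU cores, in which case each register would hold several core identifications.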
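The filtering and deduplication performed by the Snoop Agent can be sketched as follows. The tuple representation of a response and the function `merge_responses` are assumptions for illustration, and raising an error on conflicting answers is a choice of this sketch rather than something the specification states.

```python
from typing import List, Optional, Tuple

# A response is modeled as (responder core, physical address or None).
Response = Tuple[int, Optional[int]]


def merge_responses(responses: List[Response]) -> Optional[int]:
    """Drop empty responses, deduplicate, and return one physical address."""
    found = {paddr for _, paddr in responses if paddr is not None}
    if not found:
        return None  # no core had the mapping
    if len(found) > 1:
        # Assumption of this sketch: conflicting answers are an error.
        raise ValueError(f"conflicting physical addresses: {sorted(found)}")
    return found.pop()


merged = merge_responses([(0, None), (4, 0x2000), (8, 0x2000)])
empty = merge_responses([(0, None), (4, None)])
```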
As can be seen from the above description, in the scenario that multiple threads of the same process run on different CPU cores, when the CPU cores do not store physical addresses corresponding to virtual addresses in the TLB, an address probe request may be sent to other CPU cores, and the other CPU cores may search whether the physical addresses corresponding to virtual addresses are stored in the respective TLB, and may add the searched physical addresses to the address probe response and return the same. The CPU core may then store the mapping between the physical address and the virtual address in its TLB and perform memory accesses.
By adopting the technical scheme provided by the specification, TLB cache sharing among CPU cores can be realized, repeated process page table inquiry by the CPU cores under the scene that multithreading of the same process operates on different CPU cores is greatly reduced, waste of CPU core processing resources is reduced, time consumption of address translation under the condition of TLB Miss is effectively shortened, and IO performance is improved.
On the other hand, the technical scheme provided by the specification can be realized based on the existing hardware and the cache detection protocol, does not need to add new hardware, and has low cost and high feasibility.
2. Process migration
In the related art, a TLB entry may be in one of three states: Valid, Stale, and Invalid.
Valid indicates that the corresponding TLB entry is valid;
Stale indicates that the corresponding TLB entry is temporarily invalidated and can be reactivated to the Valid state;
Invalid indicates that the corresponding TLB entry has been destroyed, for example because the memory corresponding to the TLB entry was released when a process was destroyed; a destroyed TLB entry cannot be reactivated.
After a TLB entry is generated, its state is Valid. During process switching, if a process is swapped out, the TLB entries corresponding to the ASID bound to the swapped-out process are set to the Stale state. For example, the operating system sends a TLB invalidation instruction after the process is swapped out; the instruction specifies the ASID bound to the swapped-out process, and based on it, the TLB entries corresponding to the specified ASID are set from the Valid state to the Stale state. After the process is switched back in and performs a memory access, if the MMU's query hits a TLB entry in the Stale state, the state of the hit entry can be set from Stale back to Valid.
After a process is destroyed, the operating system may send a TLB destroy instruction (TLB Shootdown) that specifies the ASID bound to the destroyed process. Based on the TLB destroy instruction, the TLB entries corresponding to the specified ASID (including entries in the Valid state and entries in the Stale state) are completely destroyed, for example deleted, and are thus in the Invalid state.
In this specification, the scheduler may perform process scheduling based on the load of each CPU core; for example, a process running on a first CPU core may be scheduled to run on a second CPU core. Scheduling a process generally means scheduling all threads of the process.
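The three entry states and the two instructions that change them can be sketched as a small state machine. In this sketch the temporarily invalidated state is named `STALE`, a destroyed entry is modeled by deleting it, and all class and method names are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class TlbState(Enum):
    VALID = "Valid"
    STALE = "Stale"  # temporarily invalidated; can be reactivated


@dataclass
class TlbEntry:
    paddr: int
    state: TlbState = TlbState.VALID


class Tlb:
    def __init__(self) -> None:
        # A destroyed entry is modeled by removing it from the dictionary.
        self.entries: Dict[Tuple[int, int], TlbEntry] = {}

    def insert(self, asid: int, vaddr: int, paddr: int) -> None:
        self.entries[(asid, vaddr)] = TlbEntry(paddr)

    def invalidate(self, asid: int) -> None:
        """TLB invalidation instruction: mark every entry of one ASID stale."""
        for (entry_asid, _), entry in self.entries.items():
            if entry_asid == asid:
                entry.state = TlbState.STALE

    def destroy(self, asid: int) -> None:
        """TLB destroy instruction: delete every entry of one ASID."""
        self.entries = {k: e for k, e in self.entries.items() if k[0] != asid}

    def lookup(self, asid: int, vaddr: int) -> Optional[int]:
        entry = self.entries.get((asid, vaddr))
        if entry is None:
            return None
        entry.state = TlbState.VALID  # a hit on a stale entry reactivates it
        return entry.paddr


tlb = Tlb()
tlb.insert(7, 0x800000, 0x2000)
tlb.invalidate(7)  # process swapped out
state_after_swap_out = tlb.entries[(7, 0x800000)].state
paddr_after_swap_in = tlb.lookup(7, 0x800000)  # process switched back in
state_after_swap_in = tlb.entries[(7, 0x800000)].state
tlb.destroy(7)  # process destroyed
paddr_after_destroy = tlb.lookup(7, 0x800000)
```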
In the related art, under the condition of process scheduling, an operating system sends a TLB destroy instruction to a first CPU core, so as to thoroughly destroy a TLB entry corresponding to an ASID bound by the process in a TLB of the first CPU core. After the process is scheduled to the second CPU core, the second CPU core still needs address translation based on the process page table of the memory when performing the memory access.
To avoid such repeated lookups, the technical scheme provided by this specification does two things in the case of process scheduling. On one hand, the operating system sends a TLB invalidation instruction instead of a TLB destroy instruction, so that the relevant TLB entries are not completely destroyed. On the other hand, the second CPU core may construct an address probe request on a TLB Miss, requesting the other CPU cores to assist in the physical address lookup.
Referring to fig. 4, in a process scheduling scenario, the memory access method provided in the present disclosure may include the following steps:
In step 402, the first CPU core receives a TLB invalidation instruction, and sets the TLB entries in the first TLB that correspond to the scheduled-out process to the Stale (temporarily invalidated) state.
In this embodiment, after a process is scheduled out of the first CPU core, unlike the related art, the operating system does not send a TLB destroy instruction to the first CPU core, but sends a TLB invalidation instruction that specifies the ASID bound to the scheduled-out process.
In response to the TLB invalidation instruction, the first CPU core sets the TLB entries in the first TLB (i.e., the TLB of the first CPU core) corresponding to the ASID from the Valid state to the Stale state.
In other words, with the technical scheme provided by this specification, during process scheduling the TLB entries in the CPU core where the process originally ran are not completely destroyed, but are placed in a temporarily invalidated state.
In step 404, the second CPU core, in response to a memory access request of the scheduled-in process, searches the second TLB for the physical address corresponding to the virtual address.
In this embodiment, when the process scheduled to run on the second CPU core, or a thread of that process, performs a memory access, it initiates a memory access request to the second CPU core. The second CPU core first looks up the physical address corresponding to the ASID and the virtual address in the second TLB (i.e., the TLB of the second CPU core). If the corresponding physical address is found (TLB Hit), memory access may be performed based on the physical address. If the corresponding physical address is not found (TLB Miss), step 406 below may be performed.
In step 406, the second CPU core sends an address probe request to the first CPU core if the physical address is not found.
Based on the query result of step 404, when the physical address is not found, the second CPU core may construct an address probe request and add to it the virtual address to be accessed and the ASID bound to the process.
The second CPU core sends the address probe request. For example, the address probe request may be broadcast and sent, or the address probe request may be sent to the CPU core where the process is located before being scheduled, i.e., the first CPU core.
In this embodiment, the construction and transmission of the address detection request may refer to the specific implementation process of the foregoing embodiment, which is not described herein in detail.
It should be noted that, in the case where the second CPU core sends the address probe request to the first CPU core based on the first core identifier of the first CPU core in the register, the first core identifier in the specified register may be written by the scheduler after the process is scheduled.
In step 408, the first CPU core responds to the address detection request and searches the first TLB for the physical address corresponding to the virtual address.
In this embodiment, in response to the address probe request, the first CPU core searches the TLB entries in both the Valid state and the Stale (temporarily invalidated) state in the first TLB to look up the physical address.
If the query hits a TLB entry in the Valid state, this most likely indicates that different threads of the same process are running on different CPU cores, that is, the thread initiating the memory access request on the second CPU core and some threads running on the first CPU core belong to the same process.
If the query hits a TLB entry in the Stale state, this most likely indicates that the process has been migrated by scheduling, that is, the process originally ran on the first CPU core and was then migrated to the second CPU core by the scheduler.
In other words, a CPU core that receives an address probe request needs to query both the TLB entries in the Valid state and the TLB entries in the Stale state, and returns the physical address found when the query hits; the CPU core does not need to be aware of the specific application scenario.
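The responder-side query described above can be sketched as a lookup that ignores the entry state when deciding whether to answer. Here the temporarily invalidated state is labeled "stale", the function `probe_lookup` and the scenario strings are assumptions for illustration, and the scenario is returned only to mirror the reasoning in the text; the responder itself does not need it. Whether a probe hit changes the state of a stale entry is not stated in the specification, so this sketch leaves the state untouched.

```python
from typing import Dict, Optional, Tuple

VALID, STALE = "valid", "stale"
# (ASID, vaddr) -> (paddr, state)
Tlb = Dict[Tuple[int, int], Tuple[int, str]]


def probe_lookup(tlb: Tlb, asid: int,
                 vaddr: int) -> Tuple[Optional[int], Optional[str]]:
    """Answer a probe from valid and stale entries alike."""
    entry = tlb.get((asid, vaddr))
    if entry is None:
        return None, None
    paddr, state = entry
    # Informational only: the likely scenario implied by the entry state.
    scenario = "threads-on-different-cores" if state == VALID else "process-migration"
    return paddr, scenario


first_tlb: Tlb = {
    (7, 0x800000): (0x2000, VALID),
    (5, 0x400000): (0x9000, STALE),  # hypothetical migrated process
}
valid_hit = probe_lookup(first_tlb, 7, 0x800000)
stale_hit = probe_lookup(first_tlb, 5, 0x400000)
no_hit = probe_lookup(first_tlb, 7, 0x900000)
```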
In step 410, the first CPU core adds the found physical address to the address probe response and returns the address probe response to the second CPU core.
In step 412, after receiving the address probe response, the second CPU core stores the mapping relationship between the physical address and the virtual address in the second TLB, and performs a memory access based on the physical address.
In this embodiment, the implementation of steps 410-412 may be described with reference to the previous embodiments.
In this embodiment, if the second CPU core does not receive the address probe response carrying the physical address, for example, the second CPU core does not receive the address probe response carrying the physical address within a preset duration, the second CPU core may perform the query of the physical address based on the process page table of the memory.
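Steps 402 to 412, including the fallback to the process page table, can be put together in one small model. It is a sketch under simplifying assumptions: the probe is a direct method call on the previous core, a missing or timed-out response is modeled as `None`, and the names `Core`, `PAGE_TABLE`, `answer_probe`, and the address values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[int, int]  # (ASID, virtual address)
PAGE_TABLE: Dict[Key, int] = {(7, 0x800000): 0x2000, (7, 0x900000): 0x3000}


class Core:
    def __init__(self) -> None:
        self.tlb: Dict[Key, List] = {}  # key -> [paddr, "valid" or "stale"]
        self.page_walks = 0

    def invalidate(self, asid: int) -> None:
        # Step 402: mark the scheduled-out process's entries as stale.
        for (entry_asid, _), entry in self.tlb.items():
            if entry_asid == asid:
                entry[1] = "stale"

    def answer_probe(self, asid: int, vaddr: int) -> Optional[int]:
        # Steps 408-410: search valid and stale entries alike.
        entry = self.tlb.get((asid, vaddr))
        return entry[0] if entry else None

    def access(self, asid: int, vaddr: int, previous: "Core") -> int:
        key = (asid, vaddr)
        if key in self.tlb:  # step 404: local TLB hit
            return self.tlb[key][0]
        paddr = previous.answer_probe(asid, vaddr)  # step 406: probe
        if paddr is None:  # no response carrying a physical address
            self.page_walks += 1
            paddr = PAGE_TABLE[key]
        self.tlb[key] = [paddr, "valid"]  # step 412: cache the mapping
        return paddr


first, second = Core(), Core()
first.tlb[(7, 0x800000)] = [0x2000, "valid"]  # cached before migration
first.invalidate(7)  # the process leaves the first core
migrated = second.access(7, 0x800000, previous=first)
uncached = second.access(7, 0x900000, previous=first)
```

Only the address that the first core never cached costs the second core a page-table search.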
As can be seen from the above description, in the scenario of process migration, when the CPU core does not store the physical address corresponding to the virtual address in the TLB, the address probe request may be sent to other CPU cores, and the other CPU cores may search whether the physical address corresponding to the virtual address is stored in the respective TLB, and may add the searched physical address to the address probe response and return the same. The CPU core may then store the mapping between the physical address and the virtual address in its TLB and perform memory accesses.
By adopting the technical scheme provided by the specification, TLB cache sharing among CPU cores can be realized, repeated process page table inquiry by the CPU cores in a process migration scene is greatly reduced, waste of CPU core processing resources is reduced, address translation time consumption under the condition of TLB Miss is effectively shortened, and IO performance is improved.
On the other hand, the technical scheme provided by the specification can be realized based on the existing hardware and the cache detection protocol, does not need to add new hardware, and has low cost and high feasibility.
Corresponding to the foregoing embodiments of the memory access method, the present disclosure further provides embodiments of the memory access device.
Embodiments of the memory access device of the present specification may be applied in a CPU core of a computer system, the CPU core including a TLB for caching mappings between virtual addresses and physical addresses. Referring to fig. 5, the memory access device 500 includes an address searching unit 501, an address detecting unit 502, a memory access unit 503 and a status marking unit 504.
The address searching unit 501 responds to a memory access request and searches a physical address corresponding to a virtual address carried by the memory access request in the TLB;
An address detection unit 502, when no physical address corresponding to the virtual address is found, sends an address detection request, where the address detection request carries the virtual address, so that a CPU core that receives the address detection request searches its TLB for the physical address corresponding to the virtual address, and returns an address detection response when the corresponding physical address is found, where the address detection response carries the found physical address;
The memory access unit 503 responds to the received address detection response, stores the mapping relationship between the physical address and the virtual address carried in the address detection response into the TLB, and performs memory access based on the physical address.
Optionally, the address detection unit 502 reads a target core identifier of a target CPU core from a register, and sends an address detection request to the target CPU core based on the target core identifier.
Optionally, the address detection unit 502 broadcasts and sends an address detection request.
Optionally, the target CPU core is a CPU core where other threads in the process to which the thread that initiates the memory access request belongs are located.
Optionally, the target CPU core is a CPU core where a thread that initiates the memory access request is located before being scheduled to the present CPU core.
Optionally, the target core identification is written by a scheduler.
Optionally, the device further comprises:
a state marking unit 504, configured to receive a TLB invalidation instruction, where the TLB invalidation instruction is sent after a thread that initiated the memory access request is scheduled to another CPU core; and,
in response to the TLB invalidation instruction, to mark the mapping relationship between the virtual address and the physical address specified by the TLB invalidation instruction in the TLB as being in the invalidated state;
and the address searching unit 501 is further configured to:
after receiving an address detection request sent by another CPU core, search the TLB for the physical address corresponding to the virtual address among the mapping relationships in the valid state and in the invalidated state.
Optionally, the address lookup unit 501, when no address detection response is received, looks up the physical address corresponding to the virtual address based on the process page table in memory.
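The fallback order (local TLB, then address detection request, then process page table) can be sketched as below. The function `translate` and its parameters are hypothetical; the page table is flattened into a dictionary rather than a multi-level structure, and caching the page-table result in the TLB is a conventional assumption rather than something stated above.

```python
from typing import Callable, Dict, Optional


def translate(
    vaddr: int,
    tlb: Dict[int, int],
    send_detection_request: Callable[[int], Optional[int]],
    page_table: Dict[int, int],
) -> int:
    """Resolve vaddr via the TLB, then other cores, then the page table."""
    paddr = tlb.get(vaddr)
    if paddr is not None:
        return paddr
    # Local miss: ask other cores; None models "no response received".
    paddr = send_detection_request(vaddr)
    if paddr is None:
        # Fall back to the process page table in memory.
        # A missing key (KeyError) stands in for a page fault in this sketch.
        paddr = page_table[vaddr]
    tlb[vaddr] = paddr  # cache the mapping for later accesses (assumption)
    return paddr


local_tlb: Dict[int, int] = {}
process_page_table = {0x1000: 0xA000, 0x2000: 0xB000}
peer_tlb = {0x2000: 0xB000}

from_page_table = translate(0x1000, local_tlb, lambda v: None, process_page_table)
from_peer = translate(0x2000, local_tlb, peer_tlb.get, process_page_table)
```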
For the implementation of the functions and roles of each unit in the above apparatus, refer to the implementation of the corresponding steps in the above method; details are not repeated here.
Since the apparatus embodiments essentially correspond to the method embodiments, reference is made to the description of the method embodiments for the relevant points. The apparatus embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present description. Those of ordinary skill in the art can understand and implement this without creative effort.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer, which may be in the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or a combination of any of these devices.
Corresponding to the foregoing embodiments of the memory access method, the present disclosure further provides a CPU core, where the CPU core includes a translation lookaside buffer (TLB), the TLB is configured to cache mapping relationships between virtual addresses and physical addresses, and the CPU core is configured to:
in response to a memory access request, search the TLB for a physical address corresponding to a virtual address carried in the memory access request;
when no physical address corresponding to the virtual address is found, send an address detection request, where the address detection request carries the virtual address, so that a CPU core that receives the address detection request searches its TLB for the physical address corresponding to the virtual address and, when the corresponding physical address is found, returns an address detection response carrying the found physical address;
in response to the received address detection response, store into the TLB the mapping relationship between the virtual address and the physical address carried in the address detection response, and perform memory access based on the physical address.
Optionally, sending the address detection request includes:
sending the address detection request by broadcast.
Optionally, sending the address detection request includes:
reading a target core identifier of a target CPU core from a register;
sending the address detection request to the target CPU core based on the target core identifier.
Optionally, the target CPU core is a CPU core on which another thread of the process to which the thread initiating the memory access request belongs is located.
Optionally, the target CPU core is the CPU core on which the thread initiating the memory access request was located before being scheduled to the present CPU core.
Optionally, the target core identifier is written by a scheduler.
Optionally, the CPU core is further configured to:
receive a TLB invalidation instruction, where the TLB invalidation instruction is sent after the thread that initiated the memory access request is scheduled to another CPU core;
in response to the TLB invalidation instruction, mark the mapping relationship between a virtual address and a physical address specified by the TLB invalidation instruction in the TLB as being in an invalid state;
The CPU core is further configured to:
after receiving an address detection request sent by another CPU core, search the mapping relationships in both the valid state and the invalid state in the TLB for a physical address corresponding to the virtual address.
Optionally, the CPU core is further configured to:
when no address detection response is received, look up the physical address corresponding to the virtual address based on the process page table in memory.
Corresponding to the foregoing embodiments of the memory access method, the present disclosure further provides a computer-readable storage medium storing a computer program which, when executed by a CPU core, implements the following steps:
in response to a memory access request, searching a TLB for a physical address corresponding to a virtual address carried in the memory access request;
when no physical address corresponding to the virtual address is found, sending an address detection request, where the address detection request carries the virtual address, so that a CPU core that receives the address detection request searches its TLB for the physical address corresponding to the virtual address and, when the corresponding physical address is found, returns an address detection response carrying the found physical address;
in response to the received address detection response, storing into the TLB the mapping relationship between the virtual address and the physical address carried in the address detection response, and performing memory access based on the physical address.
Optionally, sending the address detection request includes:
sending the address detection request by broadcast.
Optionally, sending the address detection request includes:
reading a target core identifier of a target CPU core from a register;
sending the address detection request to the target CPU core based on the target core identifier.
Optionally, the target CPU core is a CPU core on which another thread of the process to which the thread initiating the memory access request belongs is located.
Optionally, the target CPU core is the CPU core on which the thread initiating the memory access request was located before being scheduled to the present CPU core.
Optionally, the target core identifier is written by a scheduler.
Optionally, the steps further comprise:
receiving a TLB invalidation instruction, where the TLB invalidation instruction is sent after the thread that initiated the memory access request is scheduled to another CPU core;
in response to the TLB invalidation instruction, marking the mapping relationship between a virtual address and a physical address specified by the TLB invalidation instruction in the TLB as being in an invalid state;
The steps further comprise:
after receiving an address detection request sent by another CPU core, searching the mapping relationships in both the valid state and the invalid state in the TLB for a physical address corresponding to the virtual address.
Optionally, the steps further comprise:
when no address detection response is received, looking up the physical address corresponding to the virtual address based on the process page table in memory.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The foregoing describes only preferred embodiments of the disclosure and is not intended to limit its scope; any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the disclosure shall be included within the scope of the disclosure.