CN117785738B - Page table prefetching method, device, chip and storage medium
- Publication number
- CN117785738B (application CN202410199711.XA)
- Authority
- CN
- China
- Prior art keywords
- mmu
- page table
- maintenance operation
- prefetching
- tlb
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The invention provides a page table prefetching method, device, chip and storage medium. The page table prefetching method comprises the following steps: acquiring a search request to the MMU and determining the search type of the search request; when the search type is a TLB maintenance operation request, judging whether the TLB maintenance operation request satisfies the MMU prefetch condition, the condition being either satisfied or not satisfied; when the MMU prefetch condition is satisfied, generating an MMU prefetch address and judging whether the TLB maintenance operation request is set; when the TLB maintenance operation request is not set, polling the TLB maintenance operation request; when the TLB maintenance operation request is set, starting MMU page table prefetching and performing a hardware page table walk to complete the update of the TLB cache information. The beneficial effects of the invention are as follows: page table walk latency is reduced and processor performance is improved.
Description
Technical Field
The present invention relates to the field of computer and processor microarchitecture design technologies, and in particular, to a page table prefetching method, a device, a chip, and a storage medium.
Background
In modern high-performance processor designs, in order to better support concurrent execution of multiple programs and operating-system memory management, data accesses mostly use virtual addresses while main memory is accessed with physical addresses. Hardware is therefore required to translate the virtual-to-physical address mapping: a memory management unit (MMU, Memory Management Unit) is usually responsible for managing the translation from virtual address to physical address, and a hardware page table walk (Page Table Walk, PTW) performs a hierarchical query of the stored page table to obtain the final physical address.
Taking a RISC-V processor as an example, the address translation process is shown in FIG. 1: a 48-bit virtual address generates a physical address through a four-level page table walk. A single address translation requires up to 4 memory accesses and can therefore take up to thousands of clock cycles to complete, and the maximum delay is doubled in the case of virtualized nested translation. Current processors mostly provide a cache of virtual-to-physical address mappings, called the TLB (Translation Lookaside Buffer). On a TLB hit, the virtual-to-physical mapping does not need the four-level walk, and the time-consuming memory accesses are avoided.
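For reference, the sketch below shows how a 48-bit SV48 virtual address decomposes into four 9-bit VPN fields plus a 12-bit page offset, which is why a full walk needs up to four memory accesses. This is a minimal illustration based on the RISC-V privileged specification, not code taken from the patent.

```c
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12u   /* 4 KiB pages */
#define VPN_BITS   9u    /* 512 8-byte PTEs per 4 KiB page-table page */

/* Extract the SV48 page offset and the four virtual page number fields. */
void sv48_decompose(uint64_t va, uint64_t vpn[4], uint64_t *offset)
{
    *offset = va & ((1ull << PAGE_SHIFT) - 1);
    for (int level = 0; level < 4; level++)   /* vpn[0] is the lowest level */
        vpn[level] = (va >> (PAGE_SHIFT + level * VPN_BITS)) & ((1ull << VPN_BITS) - 1);
}

int main(void)
{
    uint64_t vpn[4], off;
    sv48_decompose(0x00007f1234567abcULL, vpn, &off);   /* arbitrary example address */
    printf("VPN[3..0] = %llx %llx %llx %llx, offset = %llx\n",
           (unsigned long long)vpn[3], (unsigned long long)vpn[2],
           (unsigned long long)vpn[1], (unsigned long long)vpn[0],
           (unsigned long long)off);
    return 0;
}
```

Each of the four VPN fields indexes one level of the page table, so a walk that misses at every level performs one memory access per level.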
During page table allocation, page faults (page_fault) frequently occur in existing operating systems; the system repairs the page table in an exception handler and returns to the faulting instruction PC. In addition, the operating system may dynamically modify page table entries (PTEs) at run time. In both of these scenarios the TLB still holds the old mapping, and to obtain the correct new mapping, software must execute a TLB maintenance operation instruction, i.e. invalidate the stale cached entries. To manage page table caches more efficiently, current high-performance processors support TLB maintenance operations at different granularities: (1) full flush; (2) flush matching only the virtual address; (3) flush matching only the address space identification number; (4) flush matching both the virtual address and the address space identification number. Modes (2) and (4) perform the TLB maintenance flush by matching the virtual address. The general handling in current processors is that, after the TLB maintenance operation completes, the next access to that address misses in the TLB, an MMU PTW request is re-initiated, and a multi-level page table walk is performed; the address translation process then takes a long time and the access latency cost is large.
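As a minimal sketch (not taken from the patent), the four maintenance granularities above can be derived from whether the rs1 and rs2 operands of a RISC-V sfence.vma instruction are the zero register x0; the enum and function names are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    TLB_FLUSH_ALL,         /* (1) rs1 == x0, rs2 == x0: flush everything            */
    TLB_FLUSH_VADDR,       /* (2) rs1 != x0, rs2 == x0: flush one virtual address   */
    TLB_FLUSH_ASID,        /* (3) rs1 == x0, rs2 != x0: flush entries of one ASID   */
    TLB_FLUSH_VADDR_ASID   /* (4) rs1 != x0, rs2 != x0: flush one VA in one ASID    */
} tlb_flush_kind_t;

/* rs1_idx / rs2_idx are the 5-bit register indices encoded in sfence.vma. */
tlb_flush_kind_t classify_sfence_vma(uint8_t rs1_idx, uint8_t rs2_idx)
{
    bool match_va   = (rs1_idx != 0);   /* x0 means "ignore the virtual address" */
    bool match_asid = (rs2_idx != 0);   /* x0 means "ignore the ASID"            */

    if (match_va && match_asid) return TLB_FLUSH_VADDR_ASID;
    if (match_va)               return TLB_FLUSH_VADDR;
    if (match_asid)             return TLB_FLUSH_ASID;
    return TLB_FLUSH_ALL;
}
```

Only the two virtual-address-matching kinds, (2) and (4), are relevant to the prefetch condition introduced later.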
Disclosure of Invention
The main purpose of the embodiments of the invention is to provide a page table prefetching method, device, chip and storage medium, so as to reduce page table walk latency and improve processor performance.
One aspect of the present invention provides a page table prefetching method, including:
Obtaining a search request for an MMU, and determining a search type of the search request, wherein the search type comprises a TLB maintenance operation request and an access request;
when the search type is the TLB maintenance operation request, judging whether the TLB maintenance operation request satisfies an MMU prefetch condition, the MMU prefetch condition being either satisfied or not satisfied;
When the MMU prefetching condition is satisfied, generating an MMU prefetching address, and judging whether the TLB maintenance operation request is set or not; when the TLB maintenance operation request is not set, polling the TLB maintenance operation request;
When the TLB maintenance operation request is set, starting MMU page table prefetching, carrying out hardware page table lookup, and finishing updating of TLB cache information;
The judging whether the TLB maintenance operation request satisfies an MMU prefetch condition includes: obtaining the rs1 source operand and the rs2 source operand of the RISC-V instruction in the TLB maintenance operation request, wherein the rs1 source operand indicates whether a virtual address is matched and the rs2 source operand indicates the address space identification number; if the TLB flush matches the virtual address given by the rs1 source operand, the MMU prefetch condition is satisfied; if the register specified by rs1 is the zero register, the MMU prefetch condition is not satisfied, the zero register indicating that virtual-address matching is ignored.
According to the page table prefetching method described above, the method further comprises:
when the search type is an access request, executing a corresponding multi-stage page table search according to the search request, wherein the access request comprises one of an instruction PC instruction and an LD/ST instruction.
According to the page table prefetching method described above, when the MMU prefetch condition is satisfied, generating an MMU prefetch address and judging whether the TLB maintenance operation request is set includes:
Determining the MMU prefetch address according to the rs1 source operand, wherein the MMU prefetch address is one of a virtual address VA and a guest physical address GPA;
And acquiring a completion flag signal of the TLB maintenance operation request, and determining whether the TLB maintenance operation request is set or not through the flag signal.
According to the page table prefetching method described above, when the TLB maintenance operation request is not set, polling the TLB maintenance operation request includes:
the polling process of the flag signal is performed at the rising edge of each clock.
According to the page table prefetching method, when the TLB maintenance operation request is set, starting MMU page table prefetching, and performing hardware page table lookup to complete updating of TLB cache information, including:
Obtaining page table base addresses of an address translation register and a protection register of an MMU, and obtaining the MMU prefetch address, wherein the address translation register comprises a MODE field, an ASID field, and a physical page number PPN of a root page table, wherein the MODE field comprises Bare MODE, SV39 MODE, and SV48 MODE, wherein Bare MODE does not perform address translation, SV39 MODE comprises at most three levels of page table translation, and SV48 MODE comprises at most four levels of page table translation;
And performing page table prefetching according to the base address of the address translation register, the page table base address of the protection register and the MMU prefetching address by adopting a multi-stage page table inquiry mode.
According to the page table prefetching method described above, the method further comprises:
And when the MMU prefetching condition is not satisfied, executing TLB maintenance processing only according to the TLB maintenance operation request.
Another aspect of an embodiment of the present invention provides a page table prefetching apparatus, including:
The search identification module is used for acquiring a search request for the MMU and determining a search type of the search request, wherein the search type comprises a TLB maintenance operation request and an access request;
The prefetch condition judging module is used for judging, when the search type is the TLB maintenance operation request, whether the TLB maintenance operation request satisfies an MMU prefetch condition, the MMU prefetch condition being either satisfied or not satisfied;
The prefetch address determining module is used for generating an MMU prefetch address and judging whether the TLB maintenance operation request is set or not when the MMU prefetch condition is satisfied; when the TLB maintenance operation request is not set, polling the TLB maintenance operation request;
and the page table prefetching module is used for starting MMU page table prefetching when the TLB maintenance operation request is set, and performing hardware page table lookup to finish updating of TLB cache information.
Another aspect of an embodiment of the invention provides a chip, including a processor;
The processor is configured to perform the method as described hereinbefore.
Embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, cause the computer device to perform the method described previously.
The beneficial effects of the invention are as follows: the corresponding page table entries are fetched into the TLB cache in advance, turning a latency of hundreds or thousands of cycles into a latency of a few cycles (when the prefetch latency is completely hidden), so page table walk latency is reduced and processor performance is improved; page table prefetching is triggered only when the TLB maintenance operation is judged to satisfy the prefetch condition, and the prefetch request reuses the control and data paths of the normal page table walk; existing TLB prefetching algorithms are not affected, so the method can be used in combination with them to further improve prefetch efficiency.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a schematic diagram of address translation for a RISC-V processor.
FIG. 2 is a schematic diagram of a page table prefetch process according to an embodiment of the present invention.
FIG. 3 is an instruction format schematic of a RISC-V TLB maintenance instruction.
FIG. 4 is a flowchart illustrating a MMU prefetch condition determination process according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating an MMU prefetch process according to an embodiment of the present invention.
FIG. 6 is a schematic diagram of an apparatus for address translation and page table prefetching according to an embodiment of the invention.
FIG. 7 is a flow chart of address translation and page table prefetching according to an embodiment of the invention.
FIG. 8 is a diagram of a page table prefetching apparatus according to an embodiment of the invention.
Detailed Description
Embodiments of the present invention are described in detail below, and examples of the embodiments are illustrated in the accompanying drawings, in which identical or similar reference numerals denote identical or similar elements or elements having identical or similar functions. Suffixes such as "module", "component" or "unit" used to denote elements are used only to facilitate the description of the invention and have no special meaning in themselves, so "module", "component" and "unit" may be used interchangeably. "First", "second" and the like are used only to distinguish technical features and are not to be construed as indicating or implying relative importance, the number of the indicated features, or their precedence. The consecutive numbering of method steps is used only to ease examination and understanding; adjusting the order of implementation among the steps, in combination with the overall technical scheme and the logical relations among the steps, does not affect the technical effect achieved. The embodiments described below with reference to the drawings are illustrative only and are not to be construed as limiting the invention.
Term interpretation:
MMU, memory management unit (Memory Management Unit), a hardware component in the computer system, responsible for translating logical addresses (generated by the CPU) into physical addresses (for accessing main memory).
LD/ST, which means load/store data, wherein LD (load) instructions are used to load data from memory into registers, ST (store) instructions are used to store data into memory.
TLB, translation lookaside buffer, a cache storing virtual-to-physical address mappings.
RISC-V, an open source instruction set architecture.
PTE, page table entry for storing mapping information between virtual and physical pages.
Page offset, the number of bits in a virtual or physical address used to represent the offset within a page.
PPN, physical page number, a number or index that uniquely identifies a page in physical memory.
VPN, virtual page number, each virtual page corresponds to a page table entry.
Referring to fig. 2, fig. 2 is a schematic diagram of a page table prefetching flow according to an embodiment of the present invention, which includes steps S100 to S400:
S100, acquiring a search request to the MMU and determining the search type of the search request, wherein the search type includes a TLB maintenance operation request and an access request.
In some embodiments, the access request includes an instruction-fetch PC request and an LD/ST request; in this case, the corresponding multi-level page table lookup is performed for the PC or LD/ST request according to the normal flow.
In some embodiments, referring to the instruction format diagram of the RISC-V TLB maintenance instruction shown in FIG. 3, the instruction includes funct7, rs2, rs1, funct3, rd, and opcode fields, where funct7 and funct3 are function fields, rs1 and rs2 are the rs1 and rs2 source operand fields specifying the first source register (rs1) and the second source register (rs2), rd is the destination register field, and opcode is the instruction opcode field. sfence.vma is the TLB maintenance instruction of the RISC-V architecture, and vaddr is the virtual address.
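The following sketch (mine, not the patent's) shows how the rs1 and rs2 register indices could be pulled out of a 32-bit RISC-V instruction word using the standard base-ISA field positions; the check against the SFENCE.VMA opcode/funct3/funct7 values is included only as an illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard RISC-V base instruction field positions. */
#define OPCODE(insn)  ((insn) & 0x7f)
#define RD(insn)      (((insn) >> 7)  & 0x1f)
#define FUNCT3(insn)  (((insn) >> 12) & 0x07)
#define RS1(insn)     (((insn) >> 15) & 0x1f)
#define RS2(insn)     (((insn) >> 20) & 0x1f)
#define FUNCT7(insn)  (((insn) >> 25) & 0x7f)

/* SFENCE.VMA: opcode = SYSTEM (0x73), funct3 = 0, rd = 0, funct7 = 0b0001001. */
bool is_sfence_vma(uint32_t insn)
{
    return OPCODE(insn) == 0x73 && FUNCT3(insn) == 0 &&
           RD(insn) == 0 && FUNCT7(insn) == 0x09;
}

/* Returns the register indices the TLB maintenance operation was issued with. */
void decode_sfence_vma(uint32_t insn, uint8_t *rs1_idx, uint8_t *rs2_idx)
{
    *rs1_idx = (uint8_t)RS1(insn);   /* holds the virtual address to match (or x0) */
    *rs2_idx = (uint8_t)RS2(insn);   /* holds the ASID to match (or x0)            */
}
```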
S200, when the lookup type is the TLB maintenance operation request, judging whether the TLB maintenance operation request satisfies the MMU prefetch condition, the result being either satisfied or not satisfied.
In some embodiments, referring to the MMU prefetch condition determination flow diagram shown in fig. 4, when the lookup type is a TLB maintenance operation request, the MMU prefetch condition determination flow includes, but is not limited to, steps S210 to S220:
S210, an rs1 source operand and an rs2 source operand based on a RISC-V instruction set in a TLB maintenance operation request are obtained, wherein the rs1 source operand is used for representing whether a virtual address is matched or not, and the rs2 source operand is used for representing an address space identification number.
S220, if the TLB flush matches the virtual address given by the rs1 source operand, i.e. the register specified by rs1 is not the zero register, the MMU prefetch condition is satisfied; otherwise the MMU prefetch condition is not satisfied. The zero register indicates that virtual-address matching is ignored.
In some embodiments, the zero register always holds the constant value 0 and is generally used to denote a null operand or the constant zero.
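A minimal sketch of the S210/S220 check, under the assumption that the maintenance operation has already been decoded into register indices as in the earlier decode sketch; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* MMU prefetch condition (S210/S220): prefetch only when the maintenance
 * operation matches a specific virtual address, i.e. rs1 is not x0.
 * rs2 (the ASID operand) does not affect whether prefetching is triggered. */
bool mmu_prefetch_condition_met(uint8_t rs1_idx)
{
    return rs1_idx != 0;   /* rs1 == x0 means "ignore the virtual address" */
}
```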
S300, when the MMU prefetching condition is satisfied, generating an MMU prefetching address, and judging whether a TLB maintenance operation request is set or not; when the TLB maintenance operation request is not set, the TLB maintenance operation request is polled.
In some embodiments, the MMU prefetch address is determined by the virtual address VA of the rs1 operand or the guest physical address GPA of the virtual machine.
In some embodiments, the judgement is made from a flag signal. For example, it is determined whether the completion flag of the current TLB maintenance operation is set: a value of 1 indicates that the TLB cache maintenance update has completed, a value of 0 indicates that it has not, and the flag is checked by polling.
In some embodiments, the TLB maintenance operation completion flag being set to 1 means that all levels of TLB cache information have completed their status update.
In some embodiments, when the TLB maintenance operation request is not set, the flag is checked by polling until the TLB cache maintenance update has completed.
In some embodiments, the polling checks the flag at the rising edge of each clock cycle.
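As a software-style sketch only (in hardware this is a per-clock-edge check), the loop below polls a completion flag before the prefetch is started; the MMU_STATUS register address and the tlb_invalidate_done() accessor are hypothetical names introduced here, not interfaces defined by the patent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status register; bit 0 mirrors the completion flag (tlb_invalid_done). */
#define MMU_STATUS ((volatile uint32_t *)0x10001000u)   /* assumed address */

bool tlb_invalidate_done(void)
{
    return (*MMU_STATUS & 0x1u) != 0;
}

/* S300/S740: poll until the maintenance operation completes (flag == 1). */
void wait_for_tlb_maintenance_done(void)
{
    while (!tlb_invalidate_done())
        ;   /* one check per clock edge in the hardware model */
    /* Flag set: MMU page table prefetching may now be started (S400). */
}
```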
S400, when the TLB maintenance operation request is set, starting MMU page table prefetching and performing a hardware page table walk to complete the update of the TLB cache information.
In some embodiments, referring to the MMU prefetch flow diagram shown in FIG. 5, it includes, but is not limited to, steps S410-S420:
s410, acquiring page table base addresses of an address translation register and a protection register of an MMU, and acquiring an MMU prefetch address;
In some embodiments, the address translation register includes a MODE field, an ASID field, and the physical page number PPN of the root page table, wherein the MODE field includes Bare MODE, SV39 MODE, and SV48 MODE: Bare MODE performs no address translation, SV39 MODE uses at most three levels of page table translation, and SV48 MODE uses at most four levels of page table translation. The ASID field typically carries 16 bits of information, and virtual addresses of up to 48 bits are supported in SV48 mode, for example.
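The layout below assumes the address translation register follows the RISC-V satp CSR format from the privileged specification (MODE in bits 63:60, ASID in bits 59:44, PPN in bits 43:0); it is shown for illustration and is not the patent's hardware description.

```c
#include <stdint.h>

/* RV64 satp-style address translation register layout (assumed). */
#define SATP_MODE(satp)  (((satp) >> 60) & 0xfull)
#define SATP_ASID(satp)  (((satp) >> 44) & 0xffffull)   /* 16-bit ASID        */
#define SATP_PPN(satp)   ((satp) & 0xfffffffffffull)    /* 44-bit root PPN    */

enum { SATP_MODE_BARE = 0, SATP_MODE_SV39 = 8, SATP_MODE_SV48 = 9 };

/* Base physical address of the root page table used to start the walk. */
static inline uint64_t root_page_table_base(uint64_t satp)
{
    return SATP_PPN(satp) << 12;   /* PPN shifted by the 4 KiB page size */
}
```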
S420, performing page table prefetching in a multi-stage page table inquiry mode according to the base address of the address translation register, the page table base address of the protection register, and the MMU prefetch address.
In some embodiments, the multi-level page table query loads the page table with reference to the address translation diagram of the RISC-V processor shown in FIG. 1, completing one page table walk.
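To make the multi-level query concrete, a simplified SV48 walk is sketched below; it follows the RISC-V privileged specification but omits permission, superpage-alignment, and A/D-bit handling, and the read_phys_u64() accessor together with its flat memory model is a hypothetical stand-in for the hardware's memory port.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_V        0x1ull                 /* valid bit   */
#define PTE_R        0x2ull                 /* readable    */
#define PTE_X        0x8ull                 /* executable  */
#define PTE_PPN(pte) (((pte) >> 10) & 0xfffffffffffull)

/* Hypothetical flat model of physical memory standing in for the memory port. */
static uint64_t phys_mem[1 << 16];
static uint64_t read_phys_u64(uint64_t paddr) { return phys_mem[(paddr / 8) % (1 << 16)]; }

/* Simplified SV48 walk: returns true and the physical address on success. */
bool sv48_page_table_walk(uint64_t root_base, uint64_t va, uint64_t *pa)
{
    uint64_t a = root_base;                             /* level-3 table base */
    for (int level = 3; level >= 0; level--) {
        uint64_t vpn = (va >> (12 + 9 * level)) & 0x1ffull;
        uint64_t pte = read_phys_u64(a + vpn * 8);      /* one memory access per level */
        if (!(pte & PTE_V))
            return false;                               /* page fault */
        if (pte & (PTE_R | PTE_X)) {                    /* leaf PTE */
            *pa = (PTE_PPN(pte) << 12) | (va & 0xfffull);   /* superpage offset handling omitted */
            return true;
        }
        a = PTE_PPN(pte) << 12;                         /* descend to the next level */
    }
    return false;
}
```

In the prefetch path, the address walked is the MMU prefetch address taken from rs1 rather than a demand access, and the result fills the TLB cache ahead of the next lookup.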
In some embodiments, referring to the apparatus for address translation and page table prefetch shown in FIG. 6, and referring to the flow diagram for address translation and page table prefetch shown in FIG. 7, the apparatus for address translation and page table prefetch comprises a TLB buffer 610, a page table load unit 620, and a prefetch unit 630;
The flow chart of address translation and page table prefetching in fig. 7 includes steps S710 to S750:
S710, an MMU request is received; if the request is a normal instruction fetch or an LD/ST type request, jump to step S720; if the request is a TLB maintenance operation, jump to step S730.
S720, a TLB lookup is performed; if the TLB buffer 610 is hit, the data is returned directly, and if it misses, execution jumps to S750.
S730, if the MMU prefetch condition is satisfied, the prefetch unit 630 generates a prefetch address addr. The prefetch condition is that the virtual address is valid, i.e. the rs1 operand of the sfence.vma instruction is not x0 (rs1 != x0); the prefetch address comes from the rs1 source operand of the TLB maintenance operation. If the prefetch condition is not satisfied, the TLB maintenance operation is performed normally.
S740, the TLB maintenance operation completion flag is examined; if the completion flag (tlb_invalid_done) is set to 1, the prefetch unit 630 starts MMU prefetching, issues a page table walk request, and execution jumps to S750; if the flag is 0, the unit polls and waits.
S750, the page table load unit 620 loads the page table through the multi-level page table walk; when the walk completes, the TLB buffer 610 is filled and the page table lookup ends.
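Tying the steps together, the decision function below is a structural sketch of the FIG. 7 flow (S710 to S750); the request fields, enum names, and action labels are assumptions introduced for illustration, not the patent's interfaces.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef enum { REQ_FETCH, REQ_LOAD_STORE, REQ_TLB_MAINTENANCE } mmu_req_kind_t;

typedef enum {
    ACT_RETURN_TLB_HIT,        /* S720: hit, return data            */
    ACT_PAGE_TABLE_WALK,       /* S750: miss or prefetch, walk+fill */
    ACT_MAINTENANCE_ONLY,      /* S730: flush only, no prefetch     */
    ACT_WAIT_MAINTENANCE_DONE  /* S740: poll tlb_invalid_done       */
} mmu_action_t;

/* Decide the next action of the FIG. 7 flow from the current request state. */
mmu_action_t mmu_next_action(mmu_req_kind_t kind, bool tlb_hit,
                             uint8_t rs1_idx, bool invalidate_done)
{
    if (kind != REQ_TLB_MAINTENANCE)                 /* S710 -> S720 */
        return tlb_hit ? ACT_RETURN_TLB_HIT : ACT_PAGE_TABLE_WALK;

    if (rs1_idx == 0)                                /* S730: prefetch condition not met */
        return ACT_MAINTENANCE_ONLY;

    return invalidate_done ? ACT_PAGE_TABLE_WALK     /* S740 -> S750 */
                           : ACT_WAIT_MAINTENANCE_DONE;
}

int main(void)
{
    /* A TLB maintenance request with rs1 != x0 whose flush has completed
     * proceeds to the prefetch walk (S750). */
    printf("%d\n", mmu_next_action(REQ_TLB_MAINTENANCE, false, 5, true));
    return 0;
}
```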
FIG. 8 is a diagram of a page table prefetching apparatus according to an embodiment of the invention. The apparatus includes a lookup identification module 810, a prefetch condition determination module 820, a prefetch address determination module 830, and a page table prefetch module 840.
The lookup identification module 810 is configured to obtain a lookup request to the MMU and determine the lookup type of the lookup request, the lookup type including a TLB maintenance operation request and an access request; the prefetch condition determination module 820 is configured to determine, when the lookup type is a TLB maintenance operation request, whether the TLB maintenance operation request satisfies the MMU prefetch condition, the condition being either satisfied or not satisfied; the prefetch address determination module 830 is configured to generate an MMU prefetch address and determine whether the TLB maintenance operation request is set when the MMU prefetch condition is satisfied, and to poll the TLB maintenance operation request when it is not set; the page table prefetch module 840 is configured to start MMU page table prefetching when the TLB maintenance operation request is set and perform a hardware page table walk to complete the update of the TLB cache information.
Illustratively, with the cooperation of the lookup identification module 810, the prefetch condition determination module 820, the prefetch address determination module 830, and the page table prefetch module 840, the apparatus of this embodiment can implement any of the foregoing page table prefetching methods: obtaining a lookup request to the MMU and determining its lookup type (a TLB maintenance operation request or an access request); when the lookup type is a TLB maintenance operation request, judging whether it satisfies the MMU prefetch condition; when the MMU prefetch condition is satisfied, generating an MMU prefetch address and judging whether the TLB maintenance operation request is set, polling it when it is not set; and when it is set, starting MMU page table prefetching and performing a hardware page table walk to complete the update of the TLB cache information. The beneficial effects of the invention are as follows: the corresponding page table entries are fetched into the TLB cache in advance, turning a latency of hundreds or thousands of cycles into a latency of a few cycles (when the prefetch latency is completely hidden), reducing page table walk latency and improving processor performance; prefetching is triggered only when the TLB maintenance operation satisfies the prefetch condition, and the prefetch request reuses the control and data paths of the normal page table walk; existing TLB prefetching algorithms are not affected and can be used in combination with this method, improving prefetch efficiency.
The embodiment of the invention also provides a chip, which comprises a processor and a memory;
the memory stores a program;
The processor executes the program to perform the page table prefetching method described above; the chip can thus carry and run the page table prefetching software system provided by the embodiments of the invention.
Embodiments of the present invention also provide a computer-readable storage medium storing a program that is executed by a processor to implement a page table prefetching method as described above.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Embodiments of the present invention also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device may read the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the aforementioned page table prefetching method.
Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the described functions and/or features may be integrated in a single physical device and/or software module or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
Logic and/or steps represented in the flowcharts or otherwise described herein, for example an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present application has been described in detail, the present application is not limited to the embodiments described above, and those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present application, and these equivalent modifications or substitutions are included in the scope of the present application as defined in the appended claims.
Claims (9)
1. A method of page table prefetching comprising:
Obtaining a search request for an MMU, and determining a search type of the search request, wherein the search type comprises a TLB maintenance operation request and an access request;
when the search type is the TLB maintenance operation request, judging whether the TLB maintenance operation request satisfies an MMU prefetch condition, the MMU prefetch condition being either satisfied or not satisfied;
When the MMU prefetching condition is satisfied, generating an MMU prefetching address, and judging whether the TLB maintenance operation request is set or not; when the TLB maintenance operation request is not set, polling the TLB maintenance operation request;
When the TLB maintenance operation request is set, starting MMU page table prefetching, carrying out hardware page table lookup, and finishing updating of TLB cache information;
The determining that the TLB maintenance operation request satisfies the MMU prefetch condition includes: obtaining the rs1 source operand and the rs2 source operand of the RISC-V instruction in the TLB maintenance operation request, wherein the rs1 source operand indicates whether a virtual address is matched and the rs2 source operand indicates the address space identification number; if the TLB flush matches the virtual address given by the rs1 source operand, the MMU prefetch condition is satisfied;
The determining that the TLB maintenance operation request does not satisfy the MMU prefetch condition includes: if the register specified by rs1 is the zero register, the MMU prefetch condition is not satisfied, the zero register indicating that virtual-address matching is ignored;
When the TLB maintenance operation request is set, starting MMU page table prefetching, performing hardware page table lookup, and finishing updating of TLB cache information, including: acquiring page table base addresses of an address translation register and a protection register of an MMU, and acquiring a prefetch address of the MMU; and performing page table prefetching according to the base address of the address translation register, the page table base address of the protection register and the MMU prefetching address by adopting a multi-stage page table inquiry mode.
2. The page table pre-fetching method of claim 1, further comprising:
when the search type is an access request, executing a corresponding multi-stage page table search according to the search request, wherein the access request comprises one of an instruction PC instruction and an LD/ST instruction.
3. The page table prefetch method of claim 1, wherein generating an MMU prefetch address and determining whether the TLB maintenance operation request is set when the MMU prefetch condition is satisfied, comprises:
Determining the MMU prefetch address according to the rs1 source operand, wherein the MMU prefetch address is one of a virtual address VA and a guest physical address GPA;
And acquiring a completion flag signal of the TLB maintenance operation request, and determining whether the TLB maintenance operation request is set or not through the flag signal.
4. The page table prefetch method of claim 3, wherein polling the TLB maintenance operation request when the TLB maintenance operation request is not set comprises:
the polling process of the flag signal is performed at the rising edge of each clock.
5. The page table pre-fetching method of claim 1, wherein the address translation register comprises a MODE field, an ASID field, and a physical page number PPN of a root page table, wherein the MODE field comprises Bare MODEs, SV39 MODEs, and SV48 MODEs, wherein Bare MODEs do not address translation, SV39 MODEs comprise at most three-level page table translations, and SV48 MODEs comprise at most four-level page table translations.
6. The page table pre-fetching method of claim 1, further comprising:
And when the MMU prefetching condition is not satisfied, executing TLB maintenance processing only according to the TLB maintenance operation request.
7. A page table prefetching apparatus, comprising:
The search identification module is used for acquiring a search request for the MMU and determining a search type of the search request, wherein the search type comprises a TLB maintenance operation request and an access request;
The prefetch condition judging module is used for judging, when the search type is the TLB maintenance operation request, whether the TLB maintenance operation request satisfies an MMU prefetch condition, the MMU prefetch condition being either satisfied or not satisfied;
The prefetch address determining module is used for generating an MMU prefetch address and judging whether the TLB maintenance operation request is set or not when the MMU prefetch condition is satisfied; when the TLB maintenance operation request is not set, polling the TLB maintenance operation request;
The page table prefetching module is used for starting MMU page table prefetching when the TLB maintenance operation request is set, carrying out hardware page table lookup and finishing updating of TLB cache information;
The prefetch condition judging module is further configured to determine that the TLB maintenance operation request satisfies the MMU prefetch condition, including: obtaining the rs1 source operand and the rs2 source operand of the RISC-V instruction in the TLB maintenance operation request, wherein the rs1 source operand indicates whether a virtual address is matched and the rs2 source operand indicates the address space identification number; if the TLB flush matches the virtual address given by the rs1 source operand, the MMU prefetch condition is satisfied;
The prefetch condition judging module is further configured to determine that the TLB maintenance operation request does not satisfy the MMU prefetch condition, including: if the register specified by rs1 is the zero register, the MMU prefetch condition is not satisfied, the zero register indicating that virtual-address matching is ignored;
The page table prefetch module is further configured to: obtain page table base addresses of an address translation register and a protection register of the MMU, and obtain the MMU prefetch address; and perform page table prefetching in a multi-stage page table inquiry mode according to the base address of the address translation register, the page table base address of the protection register, and the MMU prefetch address.
8. A chip, comprising a processor;
the processor is configured to perform a page table prefetch method as recited in any one of claims 1-6.
9. A computer readable storage medium, wherein the storage medium stores a program that is executed by a processor to implement the page table prefetching method of any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410199711.XA CN117785738B (en) | 2024-02-23 | 2024-02-23 | Page table prefetching method, device, chip and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410199711.XA CN117785738B (en) | 2024-02-23 | 2024-02-23 | Page table prefetching method, device, chip and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117785738A CN117785738A (en) | 2024-03-29 |
CN117785738B true CN117785738B (en) | 2024-05-14 |
Family
ID=90380094
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410199711.XA Active CN117785738B (en) | 2024-02-23 | 2024-02-23 | Page table prefetching method, device, chip and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117785738B (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9086890B2 (en) * | 2012-01-06 | 2015-07-21 | Oracle International Corporation | Division unit with normalization circuit and plural divide engines for receiving instructions when divide engine availability is indicated |
US11645208B2 (en) * | 2021-03-29 | 2023-05-09 | International Business Machines Corporation | Translation bandwidth optimized prefetching strategy through multiple translation lookaside buffers |
US11782845B2 (en) * | 2021-12-02 | 2023-10-10 | Arm Limited | Faulting address prediction for prefetch target address |
- 2024-02-23: CN CN202410199711.XA patent CN117785738B (en), status: Active
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7243216B1 (en) * | 2003-04-25 | 2007-07-10 | Advanced Micro Devices, Inc. | Apparatus and method for updating a status register in an out of order execution pipeline based on most recently issued instruction information |
CN1604055A (en) * | 2003-09-30 | 2005-04-06 | 国际商业机器公司 | Apparatus and method for pre-fetching data to cached memory using persistent historical page table data |
CN101535947A (en) * | 2006-09-29 | 2009-09-16 | Mips技术公司 | Twice issued conditional move instruction, and applications thereof |
CN106537362A (en) * | 2014-07-29 | 2017-03-22 | Arm 有限公司 | A data processing apparatus, and a method of handling address translation within a data processing apparatus |
CN107430553A (en) * | 2015-03-28 | 2017-12-01 | 高通股份有限公司 | Order driving conversion for MMU prefetches |
CN105786717A (en) * | 2016-03-22 | 2016-07-20 | 华中科技大学 | DRAM (dynamic random access memory)-NVM (non-volatile memory) hierarchical heterogeneous memory access method and system adopting software and hardware collaborative management |
US10657067B1 (en) * | 2016-09-12 | 2020-05-19 | Xilinx, Inc. | Memory management unit with prefetch |
CN110162380A (en) * | 2018-02-15 | 2019-08-23 | 英特尔公司 | For preventing the mechanism of software wing passage |
CN115407931A (en) * | 2021-05-26 | 2022-11-29 | Arm有限公司 | Mapping partition identifiers |
CN114925001A (en) * | 2022-05-18 | 2022-08-19 | 上海壁仞智能科技有限公司 | Processor, page table prefetching method and electronic equipment |
CN115481051A (en) * | 2022-08-08 | 2022-12-16 | Oppo广东移动通信有限公司 | Page table prefetching method and device and system on chip |
CN115879107A (en) * | 2022-10-27 | 2023-03-31 | 北京奕斯伟计算技术股份有限公司 | Computer device and access method thereof, processing device and storage medium |
CN117573574A (en) * | 2024-01-15 | 2024-02-20 | 北京开源芯片研究院 | Prefetching method and device, electronic equipment and readable storage medium |
Non-Patent Citations (1)
Title |
---|
Prefetch-based cache replacement policy; Sun Yuqiang; Wang Wenwen; Chao Bixia; Gu Yuwan; Microelectronics & Computer; 2017-01-05 (No. 01); pp. 85-94 *
Also Published As
Publication number | Publication date |
---|---|
CN117785738A (en) | 2024-03-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10740249B2 (en) | Maintaining processor resources during architectural events | |
US10089240B2 (en) | Cache accessed using virtual addresses | |
US8296547B2 (en) | Loading entries into a TLB in hardware via indirect TLB entries | |
JP5580894B2 (en) | TLB prefetching | |
EP1941375B1 (en) | Caching memory attribute indicators with cached memory data | |
CN111552654B (en) | Processor for detecting redundancy of page table walk | |
US10146545B2 (en) | Translation address cache for a microprocessor | |
US10169039B2 (en) | Computer processor that implements pre-translation of virtual addresses | |
US20090187731A1 (en) | Method for Address Translation in Virtual Machines | |
US8296518B2 (en) | Arithmetic processing apparatus and method | |
US10083126B2 (en) | Apparatus and method for avoiding conflicting entries in a storage structure | |
US20140075123A1 (en) | Concurrent Control For A Page Miss Handler | |
KR20120096031A (en) | System, method, and apparatus for a cache flush of a range of pages and tlb invalidation of a range of entries | |
JP2003067357A (en) | Nonuniform memory access (numa) data processing system and method of operating the system | |
US20080282055A1 (en) | Virtual Translation Lookaside Buffer | |
KR100634930B1 (en) | Method and apparatus for using context identifiers in cache memory | |
CN114063934B (en) | Data updating device and method and electronic equipment | |
CN117785738B (en) | Page table prefetching method, device, chip and storage medium | |
US20160170900A1 (en) | Virtual memory address range register | |
US6567907B1 (en) | Avoiding mapping conflicts in a translation look-aside buffer | |
CN115080464B (en) | Data processing method and data processing device | |
JP2001282616A (en) | Memory management system | |
KR100343940B1 (en) | Cache anti-aliasing during a write operation using translation lookahead buffer prediction bit | |
MX2008005091A (en) | Caching memory attribute indicators with cached memory data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||