
CN111651379A - DAX device address translation cache method and system - Google Patents

DAX device address translation cache method and system

Info

Publication number
CN111651379A
CN111651379A (application CN202010357810.8A)
Authority
CN
China
Prior art keywords
address
dax
register
address translation
cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010357810.8A
Other languages
Chinese (zh)
Other versions
CN111651379B (en)
Inventor
熊子威
蒋德钧
熊劲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS
Priority to CN202010357810.8A
Publication of CN111651379A
Application granted
Publication of CN111651379B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 - Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 - Addressing or allocation; Relocation
    • G06F12/08 - Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 - Address translation
    • G06F12/1081 - Address translation for peripheral access to main memory, e.g. direct memory access [DMA]
    • G06F12/1009 - Address translation using page tables, e.g. page table structures
    • G06F2212/00 - Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 - Providing a specific technical effect
    • G06F2212/1016 - Performance improvement

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a DAX device address translation cache method and system, comprising: constructing a DAX address translation cache composed of a mapped-file first-address register (MFA), an object-offset register (OFS), a file-number register (FID) and an address translation table; according to the address translation function, writing the file number and the object offset contained in a persistent address into the file-number register and the object-offset register, respectively; the TLB translates the virtual address issued by the CPU into a physical address, while the DAX address translation cache looks up the address translation table using the value stored in the file-number register, adds the first address found to the value in the object-offset register to obtain a direct-access address, and returns that direct-access address to the CPU as the translation result of the virtual address. The invention can halve the instruction overhead of the address translation function and greatly improves its efficiency when handling multiple mapped files.

Description

DAX device address translation cache method and system

Technical Field

The present invention relates to the fields of computer architecture and non-volatile memory, and in particular to a DAX device address translation cache method and system.

Background

Opening a file and mapping it into memory, then accessing it through load and store instructions, is a common way to access mapped files in current systems. In this case the file is mapped as a huge byte array whose first address is determined by the mapping function, and the application can freely access any data in the array at runtime. The resulting problem is that if a program writes data into the mapped file and wants to retrieve that data after a restart, it must process the data so that it follows a fixed format: because the mapping function does not guarantee that the same file is mapped to the same address, the mapped address from the previous run is invalid in the current run, so the program cannot use virtual addresses to locate data stored in the mapped file during a previous run.
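As a minimal illustration of this access pattern, the following sketch uses Python's `mmap` module in place of the C `mmap(2)` call (the file path is made up); it maps a file and accesses it like a byte array with load- and store-like operations:

```python
import mmap
import os
import tempfile

# Create a small file to stand in for a persistent mapped file.
path = os.path.join(tempfile.mkdtemp(), "pool.img")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

# Map the file; the first address of the mapping is chosen by mmap,
# and a later run may receive a different address for the same file.
with open(path, "r+b") as f:
    m = mmap.mmap(f.fileno(), 4096)
    m[0:5] = b"hello"      # a "store" into the mapped byte array
    data = bytes(m[0:5])   # a "load" from the mapped byte array
    m.flush()
    m.close()

print(data)  # b'hello'
```

The mapping behaves like one large byte array, but nothing about its base address survives a restart, which is exactly why persistent addresses are needed.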

With the emergence of a new generation of non-volatile memory (NVM), researchers and companies are building NVM programming libraries that aim to offer developer-friendly application interfaces. These libraries work on persistent devices that support DAX (Direct Access) mode, and the libraries that integrate well with existing operating systems all choose to manage NVM through a file system and to access NVM resources via mapped files. These libraries therefore also need to provide a way for programs to conveniently access data on NVM after a restart.

A common design in current libraries is for each library to maintain persistent addresses for its storage objects. A persistent address stores the number of the mapped file and the offset of the object relative to the file's first address, and the library provides an address translation function that converts persistent addresses into virtual addresses at runtime, avoiding the overhead of formatting the data. As explained above, this translation is necessary: virtual addresses are volatile, and after a process restarts and remaps its files there is no guarantee that a file's first address matches the address it had in the previous run, so each library has to maintain its own persistent addresses. This design lets a program access data in NVM normally after a restart, but it has also become the performance bottleneck of NVM libraries.
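The scheme described here can be sketched as follows. This is an illustrative software model, not any particular library's API; all names (`PersistentAddr`, `mapped_first_addr`, `translate`) and the base addresses are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PersistentAddr:
    fid: int   # number of the mapped file (must not be 0)
    ofs: int   # object offset from the file's first address (must not be 0)

# Software-maintained table: file number -> first address of this run's
# mapping.  A real library refills this after every restart, because the
# mapping function may return a different first address each run.
mapped_first_addr = {1: 0x7F00_0000_0000, 2: 0x7F10_0000_0000}

def translate(p: PersistentAddr) -> int:
    """Convert a persistent address into a virtual address for this run."""
    if p.fid == 0 or p.ofs == 0:
        raise ValueError("illegal persistent address")
    return mapped_first_addr[p.fid] + p.ofs

print(hex(translate(PersistentAddr(fid=1, ofs=0x40))))  # 0x7f0000000040
```

The table lookup plus validity checks in `translate` are precisely the redundant software work that the invention moves into hardware.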

For existing NVM libraries the address translation function is expensive, accounting for roughly 13% of overall overhead, and it is hard to optimize in software: in traditional systems address translation is done by hardware, whereas these libraries perform it in software, which costs more time. Moreover, when multiple files must be managed, the translation function has to repeatedly look up the first address of each file for the current run, which is extremely inefficient. And because the function's logic is very simple and its code extremely short, software-level optimization is very difficult.

Summary of the Invention

The present invention implements a DAX address translation cache in hardware, providing hardware support that accelerates address translation for these libraries and improves the running efficiency of applications that use them.

Specifically, in view of the deficiencies of the prior art, the present invention proposes a DAX device address translation cache method, comprising:

Step 1: construct a DAX address translation cache consisting of a mapped-file first-address register (MFA), an object-offset register (OFS), a file-number register (FID) and an address translation table;

Step 2: according to the address translation function, write the file number and the object offset contained in the persistent address into the file-number register and the object-offset register, respectively;

Step 3: the TLB translates the virtual address issued by the CPU into a physical address; the DAX address translation cache looks up the address translation table using the value stored in the file-number register, adds the first address found to the value in the object-offset register to obtain a direct-access address, and returns the direct-access address to the CPU as the translation result of the virtual address.

In the DAX device address translation cache method, step 3 comprises:

Step 31: if the direct-access address is 0, the address translation function writes the mapped file's first address into the mapped-file first-address register and writes 0 to the DAX address translation cache; on receiving the write request, the DAX address translation cache uses a replacement algorithm to fill the values of the file-number register and the mapped-file first-address register into the address translation table.

The DAX device address translation cache method further comprises:

Step 4: send the physical address to the cache memory and take the data corresponding to the physical address in the cache as the response; determine whether the direct-access address is valid, and if so, return the direct-access address to the CPU, otherwise return the cache response to the CPU.

In the DAX device address translation cache method, the address translation table consists of 32 register pairs.

The present invention also provides a DAX device address translation cache system, comprising:

Module 1: constructs a DAX address translation cache consisting of a mapped-file first-address register (MFA), an object-offset register (OFS), a file-number register (FID) and an address translation table;

Module 2: according to the address translation function, writes the file number and the object offset contained in the persistent address into the file-number register and the object-offset register, respectively;

Module 3: the TLB translates the virtual address issued by the CPU into a physical address; the DAX address translation cache looks up the address translation table using the value stored in the file-number register, adds the first address found to the value in the object-offset register to obtain a direct-access address, and returns the direct-access address to the CPU as the translation result of the virtual address.

In the DAX device address translation cache system, module 3 comprises:

Module 31: if the direct-access address is 0, the address translation function writes the mapped file's first address into the mapped-file first-address register and writes 0 to the DAX address translation cache; on receiving the write request, the DAX address translation cache uses a replacement algorithm to fill the values of the file-number register and the mapped-file first-address register into the address translation table.

The DAX device address translation cache system further comprises:

Module 4: sends the physical address to the cache memory and takes the data corresponding to the physical address in the cache as the response; determines whether the direct-access address is valid, and if so, returns the direct-access address to the CPU, otherwise returns the cache response to the CPU.

In the DAX device address translation cache system, the address translation table consists of 32 register pairs.

As can be seen from the above, the advantage of the present invention is:

The invention can halve the instruction overhead of the address translation function and greatly improves its efficiency when handling multiple mapped files.

Brief Description of the Drawings

Figure 1 is a structural diagram of the address translation cache;

Figure 2 shows the connections among the CPU, the TLB and the cache;

Figure 3 is a structural diagram of the present invention;

Figure 4 compares the effect of the present invention against the prior art.

Detailed Description

While studying the efficiency of the address translation function, the inventors found that this deficiency of the prior art is caused by excessive redundant instructions. These extra instructions come from conditional branches, redundant address loads, safety checks and the like, and their purpose is to maintain a simple software cache that temporarily stores the first addresses of recently accessed mapped files.

Clearly, a large cache cannot be maintained in software, or lookups would be extremely slow, and the cache validity checks also introduce many redundant instructions. Since the purpose of these instructions is address translation, and current computer architectures already contain a TLB for accelerating address translation, it is natural to hand this work to hardware. The design, however, must satisfy several requirements: (1) change the current computer architecture as little as possible, avoiding large changes to the datapath and ideally avoiding any datapath change at all; (2) avoid adding new instructions as far as possible, otherwise the practical value of the invention is greatly reduced; (3) be easy to use, so that developers who want the performance gain do not have to rewrite large amounts of code.

Drawing on the structure of the TLB, the present invention designs a DAX address translation cache. With this cache the instruction count of the address translation function can be halved, and the device greatly improves the function's performance when handling multiple mapped files, because parallel lookup is extremely efficient in hardware.

The main inventive points of the present invention include:

Inventive point 1: balancing hardware performance against power consumption, the DAX address translation cache consists of 32 register pairs and 3 independent registers. The register pairs form the Address Translation Table; the three independent registers are the MFA register (mapped-file first-address register), the OFS register (object-offset register) and the FID register (file-number register). Each register pair stores a mapped file's number and that file's first address, and the three independent registers can transfer data into the register pairs;
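A minimal software model of this register organization is sketched below. The behavior follows the description above, but the round-robin replacement is an assumption for illustration only, since the patent does not fix a particular replacement algorithm:

```python
TABLE_ENTRIES = 32  # the patent's 32 register pairs

class DaxAddressTranslationCache:
    """Behavioral model of the hardware structure, not the hardware itself."""
    def __init__(self):
        self.table = [(0, 0)] * TABLE_ENTRIES  # (FID, first address) pairs
        self.mfa = 0   # mapped-file first-address register
        self.ofs = 0   # object-offset register
        self.fid = 0   # file-number register
        self._next = 0 # cursor for the assumed round-robin replacement

    def lookup(self) -> int:
        """Search the pairs for FID; on a hit return first address + OFS.
        A FID or OFS of 0 is an illegal address, and a miss returns 0."""
        if self.fid == 0 or self.ofs == 0:
            return 0
        for fid, first in self.table:
            if fid == self.fid:
                return first + self.ofs
        return 0

    def fill(self):
        """Model a write of 0 to the cache: install (FID, MFA) into the
        table via the replacement algorithm."""
        self.table[self._next] = (self.fid, self.mfa)
        self._next = (self._next + 1) % TABLE_ENTRIES
```

In hardware the `lookup` loop would be a parallel comparison across all 32 pairs, which is why the design handles multiple mapped files so efficiently.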

Inventive point 2: the address translation function must access the DAX address translation cache explicitly through ordinary memory access instructions. Because the invention should not disturb the existing datapaths of current computer architectures, the translation function explicitly accesses the DAX address translation cache to obtain the needed first address. On the one hand, this avoids lowering the efficiency of the existing datapath and hurting the performance of the existing system when the cache is added; on the other hand, it avoids adding extra instructions or modifying existing ones. The only change required is to register four virtual addresses in the operating system and map them to the DAX address translation cache. Since chips of every current architecture reserve address ranges of a certain size, this is not complicated;

Inventive point 3: the DAX address translation cache is responsible for checking address validity; a file number of 0 or an offset of 0 is an illegal address;

Inventive point 4: the DAX address translation cache is writable but not readable, to prevent a malicious program from using reads of the DAX cache to illegally obtain the addresses of mapped files it has no permission to access.

To make the above features and effects of the present invention clearer, embodiments are given below and described in detail with reference to the accompanying drawings.

DAX address translation cache structure:

The structure of the address translation cache is shown in Figure 1. In the figure, the three independent registers can send data unidirectionally to the address translation table; the OFS register and the address register of the matching table entry feed an adder, and the adder's output is the virtual address produced by the DAX address translation cache.

DAX address translation cache placement:

In modern computer architectures the connections among the CPU, the TLB and the cache are as shown in Figure 2. When a memory access instruction executes, the virtual address it generates is sent from the CPU to the TLB; on a hit, the TLB performs address translation and produces a physical address. The physical address is sent to the cache; on a hit, the data in the cache is transferred to the CPU, completing the access. On a miss, the physical address is sent to the memory bus, then to the memory controller, and finally to DRAM to complete the read. The cache is completely transparent to the programmer and caches DRAM; DRAM itself holds instructions and data.

The DAX address translation cache should be placed between the TLB and the cache and mapped into the CPU's reserved address range. After the TLB completes address translation, the resulting physical address is sent directly to both the DAX address translation cache and the cache. If the physical address is an access to the DAX address translation cache, the DAX cache responds and transfers its data to the CPU; otherwise the cache transfers the data to the CPU, or an error is reported. Arbitration logic is therefore needed between the DAX address translation cache and the cache: the DAX cache's response has higher priority, and data sent by the DAX cache should be transmitted first.

The address translation function is written by the developer and uses the DAX cache to speed up its execution. Normally such a function must query a software-maintained cache and then decide how to perform the translation; the present invention in effect moves that software-maintained cache into hardware. The address translation function should perform the following flow:

1. Write the FID register: write the file number contained in the persistent address into this register. The file number must not be 0; beyond that there are no special requirements, so each library is free to choose how file numbers are generated. The file number inside an address is decided by the upper-layer developer and is simply a 64-bit integer. Intel's PMDK, for instance, already manages objects by file number plus in-file offset: the FID here corresponds to PMDK's file number, and the OFS to its in-file offset.

2. Write the OFS register: write the object offset contained in the persistent address into this register. The offset must not be 0; beyond that there are no special requirements.

3. Read the address translation table of the DAX address translation cache. The DAX cache looks up the address translation table using the value stored in FID; if a matching entry is found, its first address is added to the value in the OFS register, and the sum is returned as the response to the translation function's read request.

4. Check whether the value read is 0. If it is not 0, address translation is finished; if it is 0, continue with the next step.

5. Write the MFA register: fill in the first address of the mapped file. When programming, the first step of accessing NVM is to map the file, at which point the mapped file's first address is available.

6. Write 0 to the DAX address translation cache. On receiving the write request, the DAX cache fills the FID and MFA values into the address translation table using its replacement algorithm.

7. The address translation function ends.
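The seven steps above can be sketched as a small behavioral model. This is a hypothetical software stand-in, not the hardware: `DaxCachePort` models the memory-mapped cache locations, `mmap_file()` is a made-up stand-in for the library's file-mapping routine, and the re-read after the table fill is an assumption added so the sketch returns a value:

```python
class DaxCachePort:
    """Minimal behavioral stand-in for the memory-mapped DAX cache."""
    def __init__(self):
        self.table = {}  # FID -> mapped-file first address
        self.fid = self.ofs = self.mfa = 0

    def read_table(self):      # step 3: hit -> first address + OFS, miss -> 0
        first = self.table.get(self.fid, 0)
        return first + self.ofs if first else 0

    def write_zero(self):      # step 6: install (FID, MFA) into the table
        self.table[self.fid] = self.mfa

def mmap_file(fid):            # hypothetical: returns this run's first address
    return 0x7F00_0000_0000 + fid * 0x1000_0000

def dax_translate(dax, fid, ofs):
    dax.fid = fid              # step 1: write FID
    dax.ofs = ofs              # step 2: write OFS
    addr = dax.read_table()    # step 3: read the translation table
    if addr != 0:              # step 4: a non-zero result means done
        return addr
    dax.mfa = mmap_file(fid)   # step 5: write MFA with the first address
    dax.write_zero()           # step 6: write 0 to trigger the table fill
    return dax.read_table()    # step 7: re-read after the fill (assumed)

dax = DaxCachePort()
a1 = dax_translate(dax, fid=3, ofs=0x80)  # miss path: fills the table
a2 = dax_translate(dax, fid=3, ofs=0x80)  # hit path
print(hex(a1), a1 == a2)  # 0x7f0030000080 True
```

The first call exercises the miss path (steps 5 and 6), the second the fast hit path that ends at step 4.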

Arbitration between the DTLB (Direct-access TLB) and the cache:

As stated above, when the TLB completes address translation, the resulting physical address should be sent to the DTLB and the cache simultaneously, the responses of the DTLB and the cache should be arbitrated, and the DTLB's response should be transmitted to the CPU with priority. Figure 3 shows the hardware structure used to implement this arbitration.

Evaluation. Since adding this component to a real CPU is currently impractical, the evaluation was performed by simulation. The present invention uses the gem5 simulator, which models CPUs of different architectures, including x86 and ARM, and provides two modes: full-system simulation and syscall-emulation mode. Since the invention works in user mode, no operating system needs to run, so syscall-emulation mode is used.

In the tests, the pmemobj_direct address translation function from the PMDK library developed and maintained by Intel was compared with a self-written translation function that uses the DAX address translation cache; both translated the addresses of 8 million persistent objects, once using a single memory pool and once using multiple memory pools. The respective times (in seconds) are shown in Figure 4.

Impact on existing systems:

To assess how adding the DAX address translation cache would affect existing systems, the performance of the components of an existing computer system must be evaluated.

At present the TLB can respond within 1 clock cycle and the cache within 5 clock cycles, so in theory, once an instruction is decoded and enters the execution stage, data can reach the CPU no sooner than 6 clock cycles later. Available figures show that the L1 cache hit rate is as high as 95%, and combined with the L2 cache a 97% hit rate can be achieved, so the average memory access latency can be estimated at 9 clock cycles. If the DAX address translation cache were inserted in series between the TLB and the cache, every memory access would take one extra clock cycle to arbitrate whether the physical address is forwarded to the cache; ordinary memory access instructions would suffer this additional 1-cycle delay, and performance would drop by about 20%. The design therefore insists that the TLB send the physical address to the cache and to the DAX address translation cache simultaneously, with arbitration logic selecting the data that answers the access, rather than sending it first to the DAX address translation cache and then to the cache.
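The estimate above can be reproduced as a standard average-memory-access-time calculation. The hit rates (95% L1, 97% combined with L2) and the 5-cycle cache response come from the text; the L2 and memory penalties below are assumed round numbers chosen only to illustrate the shape of the estimate, so the exact 9-cycle figure depends on whichever penalties the text's sources used:

```python
# Hit rates from the text; latencies in cycles (t_l2 and t_mem assumed).
l1_hit, l2_hit = 0.95, 0.97
t_l1, t_l2, t_mem = 5, 15, 100

# Expected latency: L1 hits, L1-miss/L2-hits, and misses that go to DRAM.
amat = (l1_hit * t_l1
        + (l2_hit - l1_hit) * t_l2
        + (1 - l2_hit) * t_mem)
print(round(amat, 2))  # 8.05 with these assumed penalties

# A serial DAX cache adds 1 cycle to every access; on the 5-cycle
# common case that is a 20% slowdown, matching the text's estimate.
print(round(1 / t_l1 * 100))  # 20 (percent)
```

This is why the parallel placement with arbitration, rather than a serial insertion, is essential to the design.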

The following are system embodiments corresponding to the method embodiments above; this embodiment can be implemented in cooperation with the embodiments above. The technical details mentioned in the embodiments above remain valid here and, to reduce duplication, are not repeated; correspondingly, the technical details mentioned here can also be applied to the embodiments above.

本发明还提供了一种DAX设备地址转换缓存系统,其中包括:The present invention also provides a DAX device address translation cache system, which includes:

模块1、构建由映射文件首地址寄存器MFA、对象偏移寄存器OFS、文件编号寄存器FID和地址转换表构成的DAX地址转换缓存;Module 1. Build a DAX address translation cache consisting of the first address register MFA of the mapping file, the object offset register OFS, the file number register FID and the address translation table;

模块2、根据地址转换函数,将持久化地址内文件编号和持久化地址内对象偏移分别写入该文件编号寄存器和该对象偏移寄存器;Module 2. According to the address conversion function, write the file number in the persistent address and the object offset in the persistent address into the file number register and the object offset register respectively;

模块3、快表将CPU发出的虚拟地址转换为物理地址,DAX地址转换缓存将通过该文件编号寄存器内存储的数据检索该地址转换表,将检索结果对应的首地址和对象偏移寄存器内数据相加,得到直接访问地址,并将该直接访问地址作为该虚拟地址的转换结果反馈给CPU。Module 3. The fast table converts the virtual address sent by the CPU into a physical address, and the DAX address translation cache will retrieve the address translation table through the data stored in the file number register, and retrieve the first address corresponding to the retrieval result and the data in the object offset register Add up to obtain the direct access address, and feed the direct access address to the CPU as the conversion result of the virtual address.

In the DAX device address translation cache system, module 3 comprises:

Module 31: if the direct access address is 0, the address translation function loads the base address of the mapped file into the mapped-file base address register and writes 0 to the DAX address translation cache; upon receiving the write request, the DAX address translation cache fills the contents of the file number register and the mapped-file base address register into the address translation table via a replacement algorithm.
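The miss path above can be sketched as follows. The 32-entry table size is taken from the text; the FIFO replacement policy and all identifiers are assumptions for illustration only (the text leaves the replacement algorithm unspecified):

```python
# Sketch of the module-31 miss path: when translation returns 0, software
# loads the mapped file's base address into the MFA register and writes 0
# to the DAX translation cache, which installs the (FID, MFA) pair,
# evicting an old entry when the table is full (FIFO assumed here).

from collections import OrderedDict

TABLE_ENTRIES = 32                      # address translation table: 32 register pairs

class DaxTranslationCache:
    def __init__(self):
        self.fid = 0                    # file number register (FID)
        self.mfa = 0                    # mapped-file base address register (MFA)
        self.table = OrderedDict()      # FID -> base address, insertion-ordered

    def write_zero(self):
        """Handle the fill request triggered by writing 0: install
        (FID, MFA), replacing the oldest entry when the table is full."""
        if self.fid not in self.table and len(self.table) >= TABLE_ENTRIES:
            self.table.popitem(last=False)   # evict the oldest register pair
        self.table[self.fid] = self.mfa

dax = DaxTranslationCache()
for fileno in range(40):                # map 40 files into a 32-entry table
    dax.fid, dax.mfa = fileno, 0x1000_0000 + fileno * 0x10_0000
    dax.write_zero()
print(len(dax.table))                   # 32: the oldest 8 entries were evicted
```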

The DAX device address translation cache system further comprises:

Module 4: sends the physical address to the cache memory and takes the data corresponding to that physical address in the cache memory as the response result; determines whether the direct access address is valid, and if so, returns the direct access address to the CPU, otherwise returns the response result to the CPU.
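The arbitration in module 4 reduces to a simple selection between the two parallel responders. A minimal sketch, assuming (consistent with the miss-handling step) that a direct access address of 0 means "invalid"; the function name is ours:

```python
# Sketch of the module-4 arbitration: the physical address goes to the
# cache memory as usual; if the DAX translation cache produced a valid
# direct access address, that address answers the request, otherwise the
# cache memory's data does.

def arbitrate(direct_access_address: int, cache_response: int) -> int:
    """Return what is fed back to the CPU: the direct access address when
    valid (non-zero assumed to mean valid), else the cache response."""
    if direct_access_address != 0:
        return direct_access_address
    return cache_response

print(hex(arbitrate(0x4000_0120, 0xDEAD)))  # valid DAX translation wins
print(hex(arbitrate(0, 0xDEAD)))            # invalid: cache data is returned
```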

In the DAX device address translation cache system, the address translation table consists of 32 register pairs.

Claims (8)

1. A DAX device address translation caching method is characterized by comprising the following steps:
step 1, constructing a DAX address translation cache consisting of a mapping file initial address register (MFA), an object offset register (OFS), a file number register (FID) and an address translation table;
step 2, writing the file number in the persistent address and the object offset in the persistent address into the file number register and the object offset register respectively according to the address conversion function;
and 3, converting, by the translation lookaside buffer (TLB), the virtual address issued by the CPU into a physical address, searching the address conversion table, by the DAX address conversion cache, using the data stored in the file number register, adding the base address corresponding to the search result to the data in the object offset register to obtain a direct access address, and feeding the direct access address back to the CPU as the conversion result of the virtual address.
2. The DAX device address translation caching method of claim 1, wherein the step 3 comprises:
step 31, if the direct access address is 0, the address translation function fills the first address of the mapping file into the first address register of the mapping file, and writes 0 into the DAX address translation cache, and after the DAX address translation cache receives the write request, the data in the file number register and the first address register of the mapping file is filled into the address translation table through a replacement algorithm.
3. The DAX device address translation caching method of claim 1, further comprising:
and 4, sending the physical address to a cache memory, taking the data corresponding to the physical address in the cache memory as a response result, judging whether the direct access address is valid, and if so, feeding the direct access address back to the CPU, otherwise feeding the response result back to the CPU.
4. The DAX device address translation caching method of claim 1, wherein the address translation table is 32 register pairs.
5. A DAX device address translation cache system, comprising:
the module 1 is used for constructing a DAX address translation cache consisting of a mapping file initial address register (MFA), an object offset register (OFS), a file number register (FID) and an address translation table;
the module 2 writes the file number in the persistent address and the object offset in the persistent address into the file number register and the object offset register respectively according to the address conversion function;
the module 3, in which the translation lookaside buffer (TLB) converts the virtual address issued by the CPU into a physical address, the DAX address translation cache retrieves the address translation table using the data stored in the file number register, adds the base address corresponding to the retrieval result to the data in the object offset register to obtain a direct access address, and feeds the direct access address back to the CPU as the translation result of the virtual address.
6. The DAX device address translation cache system of claim 5, wherein the module 3 comprises:
the module 31, wherein if the direct access address is 0, the address translation function fills the first address of the mapping file into the first address register of the mapping file and writes 0 into the DAX address translation cache, and after the DAX address translation cache receives the write request, the data in the file number register and the first address register of the mapping file are filled into the address translation table through a replacement algorithm.
7. The DAX device address translation cache system of claim 5, further comprising:
and the module 4 sends the physical address to a cache memory, takes data corresponding to the physical address in the cache memory as a response result, judges whether the direct access address is effective, if so, feeds the direct access address back to the CPU, and otherwise, feeds the response result back to the CPU.
8. The DAX device address translation cache system of claim 5, wherein the address translation table is 32 register pairs.
CN202010357810.8A 2020-04-29 2020-04-29 DAX equipment address conversion caching method and system Active CN111651379B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010357810.8A CN111651379B (en) 2020-04-29 2020-04-29 DAX equipment address conversion caching method and system


Publications (2)

Publication Number Publication Date
CN111651379A true CN111651379A (en) 2020-09-11
CN111651379B CN111651379B (en) 2023-09-12

Family

ID=72346609

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010357810.8A Active CN111651379B (en) 2020-04-29 2020-04-29 DAX equipment address conversion caching method and system

Country Status (1)

Country Link
CN (1) CN111651379B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040250053A1 (en) * 2000-08-09 2004-12-09 Mcgrath Kevin J. Multiple entry points for system call instructions
CN101609429A (en) * 2009-07-22 2009-12-23 大唐微电子技术有限公司 A kind of method and apparatus of debugging embedded operating system
CN102495132A (en) * 2011-12-13 2012-06-13 东北大学 Multi-channel data acquisition device for submarine pipeline magnetic flux leakage internal detector
CN102929796A (en) * 2012-06-01 2013-02-13 杭州中天微系统有限公司 Memory management module simultaneously supporting software backfilling and hardware backfilling
US9058284B1 (en) * 2012-03-16 2015-06-16 Applied Micro Circuits Corporation Method and apparatus for performing table lookup
CN105740168A (en) * 2016-01-23 2016-07-06 中国人民解放军国防科学技术大学 Fault-tolerant directory cache controller
CN106940815A (en) * 2017-02-13 2017-07-11 西安交通大学 A kind of programmable convolutional neural networks Crypto Coprocessor IP Core
CN108959125A (en) * 2018-07-03 2018-12-07 中国人民解放军国防科技大学 Storage access method and device supporting rapid data acquisition


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Sha Xingmian et al.: "An Efficient Shared-Memory File System for Co-Resident Virtual Machines", vol. 42, no. 4, pages 800-819 *

Also Published As

Publication number Publication date
CN111651379B (en) 2023-09-12

Similar Documents

Publication Publication Date Title
US7490217B2 (en) Design structure for selecting memory busses according to physical memory organization information stored in virtual address translation tables
US10705972B2 (en) Dynamic adaptation of memory page management policy
US7539842B2 (en) Computer memory system for selecting memory buses according to physical memory organization information stored in virtual address translation tables
Seshadri et al. RowClone: Fast and energy-efficient in-DRAM bulk data copy and initialization
KR102645481B1 (en) Trace recording by logging inflows into lower-tier caches based on entries in higher-tier caches
US11526441B2 (en) Hybrid memory systems with cache management
US9037903B2 (en) Apparatus and method for partial memory mirroring
CN115934584A (en) Memory access tracker in device private memory
JP2000250810A (en) Method, processor and system for executing load instruction
KR102268601B1 (en) Processor for data forwarding, operation method thereof and system including the same
EP3534265A1 (en) Memory access technique
US20140189192A1 (en) Apparatus and method for a multiple page size translation lookaside buffer (tlb)
CN112148641A (en) System and method for tracking physical address accesses by a CPU or device
Kumar et al. Survey on various advanced technique for cache optimization methods for RISC based system architecture
CN111742303B (en) Apparatus and method for accessing metadata when debugging a device
US6338128B1 (en) System and method for invalidating an entry in a translation unit
CN115618336A (en) Cache and operation method thereof, computer device
US6862675B1 (en) Microprocessor and device including memory units with different physical addresses
US10817433B2 (en) Page tables for granular allocation of memory pages
CN111651379B (en) DAX equipment address conversion caching method and system
CN115080464B (en) Data processing method and data processing device
CN115269199B (en) Data processing method, device, electronic device and computer-readable storage medium
CN110147670A (en) Persistence method for protecting EMS memory between a kind of process working in kernel state
WO2023241655A1 (en) Data processing method, apparatus, electronic device, and computer-readable storage medium
US7519792B2 (en) Memory region access management

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant