
CN100377115C - Stack cache memory and buffer storage method suitable for context switching - Google Patents

Stack cache memory and buffer storage method suitable for context switching

Info

Publication number
CN100377115C
CN100377115C CNB2005100868602A CN200510086860A
Authority
CN
China
Prior art keywords
stack
stack cache
address
space
access
Prior art date
Legal status
Active
Application number
CNB2005100868602A
Other languages
Chinese (zh)
Other versions
CN1963789A (en)
Inventor
郇丹丹
胡伟武
李祖松
Current Assignee
Loongson Technology Corp Ltd
Original Assignee
Institute of Computing Technology of CAS
Priority date
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CNB2005100868602A priority Critical patent/CN100377115C/en
Publication of CN1963789A publication Critical patent/CN1963789A/en
Application granted granted Critical
Publication of CN100377115C publication Critical patent/CN100377115C/en

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

This invention discloses a stack cache memory suitable for context switching and its caching method. The stack cache memory comprises at least two stack cache blocks, an OR gate circuit and a selector, wherein each stack cache block consists of a tag part, a data part and a control part, and the control part of each stack cache block consists of at least three comparison circuits and an AND gate circuit. The method comprises the following steps: (a) initializing the stack; (b) allocating stack space; (c) reclaiming stack space; and (d) comparing tags to determine whether the stack cache hits.

Description

Stack cache memory and buffer storage method suitable for context switching

Technical Field

The present invention relates to the technical field of microprocessor architecture, and in particular to a stack cache memory and a caching method suitable for context switching.

Background Art

With the rapid development of microprocessor design and fabrication technology, the gap between memory access speed and processor computation speed has become increasingly significant, and it keeps widening at a rate of about 50% per year, so that memory access is more and more the bottleneck for improving processor performance. Exploiting the principle of locality by adding one or more levels of cache memory ("cache" for short) is one of the effective means of improving the performance of the memory system. A cache is a small, fast special-purpose memory that holds the instructions and data most recently used by the processor. When the processor runs, an instruction or data item that is already in the cache can be accessed at very high speed; otherwise main memory must be accessed, which takes much longer. A well-designed cache can therefore greatly reduce the processor's average memory access time.

To exploit locality even better and raise the cache hit rate, the cache is further divided into an instruction cache and a data cache. Memory accesses fall into the code segment, data segment, heap space and stack space, which makes a further refinement of the data cache possible. Accesses to stack-space data exhibit very good temporal and spatial locality: a program continuously accesses data near the top of the stack. Saving local variables, passing parameters, and saving and restoring registers during a function call are all performed through accesses to the stack space. A stack cache memory ("stack cache" for short) separates stack accesses from the data cache; it can better exploit the characteristics of stack-space accesses, avoids the pollution caused when stack data evicts heap data from the data cache, and reduces the load on the data cache ports. Stack-space accesses have the following characteristics: (1) they exhibit very good temporal and spatial locality, with the program continuously accessing data near the top of the stack, so the stack cache does not need to be large to achieve a high hit rate; (2) when stack space is allocated, i.e. the stack-top pointer (sp) decreases, the original contents of the corresponding block need not be fetched from the lower-level memory system; (3) when stack space is reclaimed, i.e. the stack-top pointer (sp) increases, the reclaimed values (even dirty ones) need not be written back to the lower-level memory system; (4) stack accesses address a contiguous region, which can be accessed with the same base address plus an offset.

Fig. 1 shows the structure and access path of a traditional stack cache. To obtain the hit information and the hit data quickly, the stack cache is accessed with virtual addresses. The stack cache consists of a tag part, a data part and a control part; its input is the address of the access, and its outputs are the hit/miss signal and the hit data. The tag part of the stack cache comprises a virtual base address (Vbase) field, a valid bit (Valid) field, a physical base address (Pbase) field, a stack-top address (Top) field and a stack-bottom address (Bottom) field. The data part comprises a data (Data) field and a dirty bit (W) field indicating whether the data has been written. Because a microprocessor's stack grows from high addresses toward low addresses, the stack-bottom address is greater than the stack-top address. The stack-bottom address is fixed by software convention, i.e. by the application binary interface (ABI). As shown in Fig. 1, the control part comprises a first comparison circuit, a second comparison circuit and an AND gate circuit. The first comparison circuit decides whether the base address of the access equals the virtual base address of the stack cache, i.e. whether Base = Vbase. The second comparison circuit decides whether the access address belongs to the stack space, i.e. whether Top ≤ Vaddr ≤ Bottom. The AND gate ANDs the outputs of the first and second comparison circuits with the valid bit to determine whether the stack cache hits, and outputs the hit or miss signal.
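
As an illustration of the tag and control logic just described, here is a minimal C sketch; the struct name, field types and the 32-bit address width are assumptions for illustration, not taken from the patent figures:

```c
#include <stdbool.h>
#include <stdint.h>

/* Traditional stack cache tag: one entry covering a contiguous stack region. */
typedef struct {
    uint32_t vbase;   /* virtual base address of the cached region           */
    uint32_t pbase;   /* physical base address corresponding to vbase        */
    uint32_t top;     /* current stack-top address (lowest cached address)   */
    uint32_t bottom;  /* stack-bottom address fixed by the ABI               */
    bool     valid;   /* valid bit                                           */
} stack_cache_tag;

/* First comparator: Base == Vbase; second comparator: Top <= Vaddr <= Bottom;
 * the AND gate combines both results with the valid bit into the hit signal. */
static bool stack_cache_hit(const stack_cache_tag *t, uint32_t base, uint32_t vaddr)
{
    bool base_match = (base == t->vbase);
    bool in_range   = (vaddr >= t->top) && (vaddr <= t->bottom);
    return t->valid && base_match && in_range;
}
```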

The address used to access the stack cache is divided into two fixed parts: the base address (Base) and the offset (Offset). The base address is compared with the virtual base address of the stack cache to decide whether the access hits; the offset selects the required content in the data field, and the stack cache outputs the hit data.
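
The fixed Base/Offset split can be sketched as follows; the 20-bit base and 12-bit offset (4 KB region) are borrowed from the worked examples later in this description and are an assumption for the traditional cache of Fig. 1:

```c
#include <stdint.h>

#define OFFSET_BITS 12u                       /* assumes a 4 KB cached region */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1)

/* Split a virtual address into the base used for tag comparison and the
 * offset used to index the data array.                                       */
static inline uint32_t addr_base(uint32_t vaddr)   { return vaddr >> OFFSET_BITS; }
static inline uint32_t addr_offset(uint32_t vaddr) { return vaddr & OFFSET_MASK; }
```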

The stack cache is organized as a circular queue and caches a contiguous region. It detects the allocation and reclamation of stack space by monitoring changes of the stack-top pointer. On allocation, i.e. when the stack-top pointer (sp) decreases, stack space is allocated directly in the stack cache without fetching the original contents of the corresponding block from the lower-level memory system. If the stack cache does not have enough room for the new stack space, the data closest to the stack bottom is evicted to keep the cached region contiguous. On reclamation, i.e. when the stack-top pointer (sp) increases, dirty reclaimed values need not be written back to the lower-level memory system.

A stack cache access performs a tag comparison to determine whether it hits. The tag comparison checks that the access address belongs to the stack space, i.e. that it is greater than or equal to the stack-top address and less than or equal to the stack-bottom address, that the base address of the access equals the virtual base address in the stack cache tag, and that the valid bit field of the tag is 1. If all of these conditions hold, the stack cache hits, and the offset is used to index the data field to obtain the required data.

On a stack cache miss, if the access address does not belong to the stack space, the access is handled by the data cache. If the access address belongs to the stack space and is smaller than the smallest address currently held in the stack cache, i.e. smaller than the address closest to the stack top, the data between that smallest address and the miss address is fetched from the lower-level memory system. If the access address belongs to the stack space and is larger than the largest address currently held in the stack cache, i.e. larger than the address closest to the stack bottom, the data between that largest address and the miss address is fetched from the lower-level memory system. This keeps the cached stack region contiguous.
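
A hedged sketch of the miss-range computation described above; cache_min and cache_max are illustrative names for the lowest and highest addresses currently held in the stack cache:

```c
#include <stdint.h>

/* On a miss whose address lies in the stack space, the traditional stack cache
 * extends its contiguous region toward the miss address.  This sketch only
 * computes the address range that must be fetched from the lower-level memory. */
typedef struct { uint32_t lo, hi; } fill_range;

static fill_range miss_fill_range(uint32_t miss_addr,
                                  uint32_t cache_min, uint32_t cache_max)
{
    fill_range r = { miss_addr, miss_addr };
    if (miss_addr < cache_min)       /* miss below the cached region: fetch   */
        r.hi = cache_min;            /* [miss_addr, cache_min]                */
    else if (miss_addr > cache_max)  /* miss above the cached region: fetch   */
        r.lo = cache_max;            /* [cache_max, miss_addr]                */
    return r;                        /* keeps the cached region contiguous    */
}
```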

During a context switch, because the stack cache records no identification of the process (or thread), the new process may be assigned the same virtual addresses as the original one, so that the two have the same virtual addresses but different physical addresses. To preserve data consistency across a context switch, all dirty data in the stack cache must therefore be written back to the lower-level memory system, freeing the space for the new process; see U.S. Patent No. 6,167,488, "Stack caching circuit with overflow/underflow unit". Even if there is enough room in the stack cache for the new process, the dirty data must still be written back to guarantee correctness.

When the processor runs a single-process application, the stack cache performs very well. Under multi-user, multi-programming and multi-threading workloads, however, a context switch requires all dirty data in the stack cache to be written back to the lower-level memory system at once, which is expensive. After the context is switched back, the written-out data must be fetched into the stack cache again, so the cost of the data transfers is high. A processor with a stack cache therefore performs well in single-process applications but poorly in a multi-process (including multi-threaded) environment, especially when context switches are frequent. Multi-user, multi-programming and multi-threaded applications are the trend in microprocessor development and cannot be avoided.

The shortcomings of the prior art therefore call for a microprocessor stack cache suitable for context switching, so as to reduce the processor's average memory access time and substantially improve the memory access performance of the microprocessor in practical applications.

Summary of the Invention

The object of the present invention is to overcome the shortcoming that prior-art stack caches are not suitable for context switching, and to provide a microprocessor stack cache and method suitable for context switching that work well even with frequent context switches in multi-user, multi-programming and multi-threaded environments, with small hardware overhead and easy implementation.

To achieve the above object, the present invention adopts the following technical solution:

A stack cache memory suitable for context switching, comprising:

two or more stack cache blocks, each stack cache block consisting of a tag part, a data part and a control part;

an OR gate circuit, connected to the outputs of the control parts of the stack cache blocks, for ORing the hit signals of the individual stack cache blocks and outputting the hit or miss result of the stack cache memory;

a selector, connected to the outputs of the control parts of the stack cache blocks and to the outputs of the data parts of the stack cache blocks, for selecting the data of the hit stack cache block and outputting the stack cache hit data.

Further, the tag part of each stack cache block includes a virtual base address (Vbase) field, a valid bit (Valid) field, a physical base address (Pbase) field, a stack-top address (Top) field, a stack-bottom address (Bottom) field, and a process address space identifier (PASID, "process identifier" for short) field.

Further, the data part of each stack cache block includes a data (Data) field and a dirty bit (W) field indicating whether the data has been written.

Further, the control part of each stack cache block includes at least three comparison circuits and an AND gate circuit. The inputs of the first comparison circuit are connected to the virtual base address field of the tag part and to the base address field of the stack cache access address; it decides whether the base address of the access equals the virtual base address of the tag part, and its output is connected to the AND gate. The inputs of the second comparison circuit are connected to the stack-top and stack-bottom address fields of the tag part and to the stack cache access address; it decides whether the access address belongs to the stack space, and its output is connected to the AND gate. The inputs of the third comparison circuit are connected to the process address space identifier field of the tag part and to the process address space identifier field of the control register; it decides whether the value of the tag's process address space identifier field equals the process address space identifier of the access instruction, and its output is connected to the AND gate. The valid bit field of the tag part is also connected to an input of the AND gate. The output of the AND gate is connected to the OR gate circuit and to the selector, respectively. The content of the control register is the process address space identifier field, and the control register is directly connected to the input of the third comparison circuit of the stack cache memory.
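 
A behavioral C sketch of the per-block control logic (the three comparators plus the AND gate) together with the OR gate and the selector; names and field widths are illustrative, and the sequential loop merely stands in for comparisons that real hardware performs in parallel:

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-block tag of the proposed stack cache: the traditional fields plus a
 * process address space identifier (PASID).  Field widths are illustrative. */
typedef struct {
    uint32_t vbase;
    uint32_t pbase;
    uint32_t top;
    uint32_t bottom;
    uint32_t pasid;   /* PASID of the owning process                         */
    bool     valid;
} stack_cache_block_tag;

/* The three comparators and the AND gate of one block's control part:
 * (1) Base == Vbase, (2) Top <= Vaddr <= Bottom, (3) tag PASID == PASID of
 * the access instruction (taken from the control register), ANDed with the
 * valid bit.                                                                 */
static bool block_hit(const stack_cache_block_tag *t,
                      uint32_t base, uint32_t vaddr, uint32_t access_pasid)
{
    return t->valid &&
           (base == t->vbase) &&
           (vaddr >= t->top) && (vaddr <= t->bottom) &&
           (t->pasid == access_pasid);
}

/* The OR gate and the selector: OR all per-block hit signals and pick the
 * data of the hitting block (data arrays omitted for brevity).              */
static int stack_cache_lookup(const stack_cache_block_tag *blocks, int nblocks,
                              uint32_t base, uint32_t vaddr, uint32_t pasid)
{
    for (int i = 0; i < nblocks; i++)
        if (block_hit(&blocks[i], base, vaddr, pasid))
            return i;                /* index of the hitting block            */
    return -1;                       /* overall miss                          */
}
```

In hardware the per-block comparisons run in parallel; at most one block is expected to match a given access, since blocks belonging to the same process cache different base addresses.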

In the microprocessor stack cache memory suitable for context switching provided by the present invention, the inputs are the stack cache access address and the process address space identifier of the access instruction, and the outputs are the hit/miss signal and the hit data.

In the present invention, the value of the process address space identifier of the access instruction comes from a control register of the microprocessor; every memory access instruction has a corresponding process address space identifier. Every microprocessor has control registers holding the content that corresponds to the process address space identifier; only the particular register and the storage format differ. For example, a MIPS processor stores the address space identifier ASID (Address Space Identifier) in the EntryHi register and the global bit G (Global Bit) in the EntryLo register; together they form the process address space identifier.
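
As a purely illustrative sketch of how such an identifier might be packed from the ASID and global bit mentioned above; the concrete bit layout is not specified in the text and is an assumption here:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical composition of a process address space identifier from the
 * MIPS ASID (EntryHi) and the global bit G (EntryLo).  The bit layout below
 * is an assumption for illustration only.                                    */
static uint32_t make_pasid(uint8_t asid, bool global_bit)
{
    /* bit 8 = G, bits 7..0 = ASID; a set G bit marks mappings shared by all
     * processes, so a real design would also bypass the PASID comparison for
     * such accesses.                                                          */
    return ((uint32_t)global_bit << 8) | asid;
}
```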

Based on the above microprocessor stack cache memory, a microprocessor stack caching method suitable for context switching comprises the following steps (a behavioral sketch of the access flow is given after step (6)):

(1) On a context switch, initialize the stack: if no stack space has been allocated in the stack cache for the corresponding process, record the initial stack-bottom address and stack-top address in the stack cache.

(2) Stack space allocation: if the stack cache has allocatable space, allocate new unused space in the stack cache; if it has no allocatable space, select a stack cache block to write back to the lower-level memory system and initialize the tag of the newly allocated stack cache block.

(3) Stack space reclamation: reclaim and release the stack cache space directly, without writing dirty reclaimed values back to the lower-level memory system.

(4) Instruction access to the stack cache: perform a tag comparison and determine from the result whether the access hits; if it hits, go to step (5); if it misses, go to step (6).

(5) Output the hit data obtained by indexing the data of the hit stack cache block with the offset.

(6) Determine whether the access address belongs to the stack space. If it does not, the access is handled by the data cache; if it does, fetch from the lower-level memory system the data of the stack cache block containing the miss address between the stack-bottom and stack-top addresses.
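
The lookup flow of steps (4) to (6), referenced above, might look as follows in behavioral C; data_cache_access(), stack_cache_refill() and read_block_data() are hypothetical helpers, and the current process's stack bounds are simply passed in as parameters:

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t vbase, pbase, top, bottom, pasid;
    bool     valid;
} scb_tag;

/* Hypothetical helpers standing in for the data-cache path and the refill
 * from the lower-level memory system.                                        */
extern uint32_t data_cache_access(uint32_t vaddr);
extern uint32_t stack_cache_refill(scb_tag *blocks, int n,
                                   uint32_t vaddr, uint32_t pasid);
extern uint32_t read_block_data(int block, uint32_t offset);

static bool scb_hit(const scb_tag *t, uint32_t base, uint32_t vaddr, uint32_t pasid)
{
    return t->valid && base == t->vbase &&
           vaddr >= t->top && vaddr <= t->bottom && t->pasid == pasid;
}

/* Steps (4)-(6): tag compare, output hit data, or fall back to the data
 * cache / refill path on a miss.                                             */
uint32_t stack_cache_load(scb_tag *blocks, int n, uint32_t vaddr,
                          uint32_t pasid, uint32_t sp_top, uint32_t sp_bottom)
{
    uint32_t base   = vaddr >> 12;          /* assumes 4 KB blocks            */
    uint32_t offset = vaddr & 0xfffu;

    for (int i = 0; i < n; i++)                         /* step (4)           */
        if (scb_hit(&blocks[i], base, vaddr, pasid))
            return read_block_data(i, offset);          /* step (5)           */

    if (vaddr < sp_top || vaddr > sp_bottom)            /* step (6): not stack */
        return data_cache_access(vaddr);
    return stack_cache_refill(blocks, n, vaddr, pasid); /* step (6): refill   */
}
```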

In step (2) above, if the stack cache has no allocatable space, the stack cache block to be written back to the lower-level memory system can be chosen with a first-in-first-out (FIFO), random, or least-recently-used (LRU) policy. With the FIFO policy, the earliest-allocated stack cache block can be identified by adding to each block's tag a field (Age) recording the block's allocation time: the Age field of a newly allocated block is cleared, and the Age fields of the other stack cache blocks are incremented by 1.
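
A small sketch of the FIFO choice via the Age field, under the convention described above (largest Age means oldest block); types are minimal and illustrative:

```c
#include <stdint.h>

typedef struct { uint32_t age; /* other tag fields omitted */ } scb_age_tag;

/* FIFO victim: the block with the largest Age entered the cache earliest.    */
static int choose_fifo_victim(const scb_age_tag *blocks, int n)
{
    int victim = 0;
    for (int i = 1; i < n; i++)
        if (blocks[i].age > blocks[victim].age)
            victim = i;
    return victim;
}

/* After allocation: clear the new block's Age, increment everyone else's.    */
static void update_ages_after_alloc(scb_age_tag *blocks, int n, int newly_alloc)
{
    for (int i = 0; i < n; i++)
        blocks[i].age = (i == newly_alloc) ? 0 : blocks[i].age + 1;
}
```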

In step (4) above, the tag comparison means checking whether the following conditions hold simultaneously: the access address is greater than or equal to the stack-top address and less than or equal to the stack-bottom address; the base address of the access equals the virtual base address in a stack cache block tag; the valid bit field of the block tag whose virtual base address equals the access base address is 1; and the value of its process address space identifier field equals the process address space identifier of the access instruction. The stack cache access hits if these conditions are all satisfied; otherwise it misses.

The present invention has the following advantages:

1. The stack cache of the present invention is organized in blocks, and a dedicated process address space identifier is used in the stack cache block tag to distinguish the address spaces of different processes. It is therefore a stack caching method aimed specifically at multi-user, multi-programming and multi-threaded environments that adapts well to process (including thread) context switches.

2. The present invention only needs to add the process address space identifier (PASID) field and the Age field to the stack cache block tag, so the hardware overhead is small, the control is simple, and implementation complexity is avoided.

Brief Description of the Drawings

Fig. 1 shows the structure and access path of a traditional stack cache.

Fig. 2 shows the structure and access path of a stack cache suitable for context switching according to an embodiment of the present invention.

Detailed Description of the Embodiments

The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

In Fig. 2, reference numeral 10 denotes an embodiment of a stack cache memory suitable for context switching according to the present invention. In this embodiment, the stack cache consists of two stack cache blocks, an OR gate circuit 11 and a selector 12. Its inputs are the stack cache access address 13 and the process address space identifier 14 of the access instruction; its outputs are the hit/miss signal and the hit data. The data within each stack cache block is contiguous, as in a traditional stack cache.

The two stack cache blocks have the same structure; for convenience they are called the first and second stack cache blocks. The first stack cache block comprises a tag part 15, a data part 16 and a control part 17. The tag part 15 of the first stack cache block includes: a virtual base address (Vbase) field indicating the virtual base address of the block, a valid bit (Valid) field indicating whether the block is valid, a physical base address (Pbase) field indicating the physical base address corresponding to the block's virtual base address, a stack-top address (Top) field indicating the stack-top address of the stack space of the process that owns the block, a stack-bottom address (Bottom) field indicating the stack-bottom address of that stack space, a process address space identifier (PASID) field indicating the process address space identifier of the owning process, and an allocation time (Age) field indicating the block's allocation time. Each entry of the tag part in the figure corresponds to the tag of one stack cache block; reference numeral 18 denotes the tag part of the second stack cache block, whose structure is the same as the tag part 15 of the first block. The data part 16 of the first stack cache block includes a data (Data) field and a dirty bit (W) field indicating whether the data has been written. Each entry of the stack cache data in the figure represents the data of one stack cache block; reference numeral 19 denotes the data part of the second stack cache block, whose structure is the same as the data part 16 of the first block. The control part 17 of the first stack cache block includes a first comparison circuit 20, a second comparison circuit 21, a third comparison circuit 22 and an AND gate circuit 23. The first comparison circuit 20 decides whether the base address of the access equals the virtual base address of the first stack cache block, i.e. whether Base = Vbase. The second comparison circuit 21 decides whether the access address belongs to the stack space, i.e. whether Top ≤ Vaddr ≤ Bottom. The third comparison circuit 22 decides whether the value of the process address space identifier field equals the process address space identifier 14 of the access instruction. The value of the process address space identifier 14 (PASID) of the access instruction comes from a control register of the microprocessor; every memory access instruction has a corresponding process address space identifier. The AND gate circuit 23 ANDs the output of the first comparison circuit 20, the output of the second comparison circuit 21, the output of the third comparison circuit 22 and the valid bit field of the tag part 15 to determine whether the first stack cache block hits, and outputs the hit or miss signal of the first stack cache block.

The OR gate circuit 11 ORs the hit signals of the individual stack cache blocks (the first and second stack cache blocks in this embodiment) and outputs the hit or miss result of the stack cache 10.

The selector 12 selects the data of the hit stack cache block and outputs the stack cache hit data.

The address (Vaddr) 13 of an instruction accessing the stack cache is divided into two fixed parts: a base address (Base) and an offset (Offset). The base address is compared with the virtual base address in the stack cache block tags to decide whether the access hits, and the offset selects the required content in the data field. A control register of the processor that carries process identification information supplies the process address space identifier 14 of the instruction accessing the stack cache; it is compared with the process address space identifier in the stack cache block tags to identify the process that issued the access instruction.

In this embodiment, two stack cache blocks are used to illustrate the stack cache of the present invention. It should be understood that a stack cache according to the present invention may contain more stack cache blocks, all with the same structure and connections, which is within the ability of those skilled in the art.

With the stack cache memory provided by this embodiment, a microprocessor stack caching method suitable for context switching is carried out in the following concrete steps (an illustrative sketch of steps (2), (3) and (6) follows step (6)):

(1) On a context switch, initialize the stack: if no stack space has been allocated in the stack cache for the corresponding process, record the process's initial stack-bottom address and stack-top address in the stack cache. If the corresponding process stack space has already been allocated in the stack cache, the original process context is being switched back in and no action is required.

(2) Stack space allocation, i.e. the stack-top pointer decreases: if the stack cache has allocatable space, allocate new unused space in the stack cache. If it has no allocatable space, select the earliest-allocated stack cache block according to the first-in-first-out (FIFO) policy, write it back to the lower-level memory system, release it for use by the new process, and initialize the tag of the newly allocated stack cache block. The earliest-allocated block is identified through the allocation time (Age) field added to the block tag: the block with the largest Age value entered first. The Age field of the newly allocated block's tag is cleared, and the Age fields of the other blocks' tags are incremented by 1. If the process requesting stack space already has a stack cache block allocated, modify the stack-top pointer in that block's tag and clear the W fields corresponding to the newly allocated stack space. If the stack-top address is not within a block already allocated to the process, allocate a new block containing the stack-top address. If the process requesting stack space has no stack cache block allocated, allocate for it a new block containing the stack-top address. In the tag of the newly allocated block, Vbase is set from the base address of the stack top, Pbase is set to the physical base address corresponding to that base address, the process address space identifier field is set to the current process address space identifier, the stack-bottom address (Bottom) and stack-top address (Top) are set to the initial stack-bottom address and the current stack-top address, the valid bit field (Valid) is set to 1, and the W fields of the block data are initialized to 0.

(3) Stack space reclamation, i.e. the stack-top pointer increases: without writing dirty reclaimed values back to the lower-level memory system, reclaim and release the stack cache space directly and update the stack-top pointer. If the whole stack space is reclaimed, i.e. the stack-top pointer equals the stack-bottom pointer, set the valid bit field (Valid) of the stack cache tag corresponding to the reclaimed stack space to 0.

(4) An instruction accesses the stack cache; a tag comparison is performed and the result determines whether the stack cache hits. The tag comparison checks whether: (a) the access address is greater than or equal to the stack-top address and less than or equal to the stack-bottom address; (b) the base address of the access equals the virtual base address in a stack cache block tag; (c) the valid bit (Valid) field of the block tag whose virtual base address equals the access base address is 1; and (d) the process address space identifier field of that same block tag equals the process address space identifier of the access instruction. If all of the above conditions hold, the stack cache hits and step (5) is executed; otherwise step (6) is executed.

(5) The access hits: output the hit data obtained by indexing the data of the hit stack cache block with the offset.

(6) The access misses: determine whether the access address belongs to the stack space. Two cases are handled: (a) the access address does not belong to the stack space and the access is handled by the data cache; (b) the access address belongs to the stack space, so the data of the stack cache block containing the miss address, between the stack-bottom and stack-top addresses, is fetched from the lower-level memory system to keep the stack cache block contiguous. A stack cache block whose virtual base address equals the base address of the access is allocated. When the data returns from the lower-level memory system, Vbase in the block tag is set to the base address, Pbase is set to the physical base address corresponding to that base address, the Valid field is set to 1, the PASID field of the block tag is filled with the PASID from the control register, the Age field is cleared, and the Age fields of the other stack cache blocks are incremented by 1. The returned data is stored in the Data field of the block, and the corresponding W fields are set to 0.
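
A behavioral C sketch of steps (2), (3) and (6) above, referenced from the step list; the 4 KB block size, the field widths and the helpers translate_base() and lower_level_read() are assumptions for illustration, not part of the patent:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 4096u                    /* assumed 4 KB stack cache block */
#define BLOCK_WORDS (BLOCK_BYTES / 4u)

typedef struct {
    uint32_t vbase, pbase;                   /* 20-bit virtual/physical bases  */
    uint32_t top, bottom;                    /* stack-top / stack-bottom       */
    uint32_t pasid, age;
    bool     valid;
    bool     w[BLOCK_WORDS];                 /* per-word dirty bits            */
    uint32_t data[BLOCK_WORDS];
} scb;

/* Hypothetical helpers: address translation and lower-level memory access.   */
extern uint32_t translate_base(uint32_t vbase);
extern uint32_t lower_level_read(uint32_t paddr);

/* Step (2): sp decreases within a block the process already owns: move Top
 * and clear the dirty bits of the newly allocated words (no refill needed).  */
static void allocate_in_owned_block(scb *b, uint32_t new_top)
{
    for (uint32_t a = new_top; a < b->top; a += 4)
        b->w[(a & (BLOCK_BYTES - 1)) >> 2] = false;
    b->top = new_top;
}

/* Step (2): a block is (re)assigned to the current process after the FIFO
 * victim has been written back; the tag is initialised as described above.   */
static void init_new_block(scb *b, uint32_t new_top, uint32_t init_bottom,
                           uint32_t cur_pasid)
{
    b->vbase  = new_top >> 12;
    b->pbase  = translate_base(b->vbase);
    b->top    = new_top;
    b->bottom = init_bottom;
    b->pasid  = cur_pasid;
    b->valid  = true;
    b->age    = 0;                           /* other blocks' Age += 1         */
    memset(b->w, 0, sizeof b->w);
}

/* Step (3): sp increases: no write-back, just move Top; invalidate the block
 * once the whole space is reclaimed (Top == Bottom).                          */
static void reclaim(scb *b, uint32_t new_top)
{
    b->top = new_top;
    if (b->top == b->bottom)
        b->valid = false;
}

/* Step (6), case (b): fill the block containing the miss address with the
 * data lying between the stack-top and stack-bottom addresses.               */
static void fill_on_miss(scb *b, uint32_t miss_vaddr,
                         uint32_t top, uint32_t bottom, uint32_t pasid)
{
    init_new_block(b, top, bottom, pasid);
    b->vbase = miss_vaddr >> 12;             /* block of the missing address   */
    b->pbase = translate_base(b->vbase);

    uint32_t blk_lo = b->vbase << 12, blk_hi = blk_lo + BLOCK_BYTES;
    uint32_t lo = top    > blk_lo ? top    : blk_lo;
    uint32_t hi = bottom < blk_hi ? bottom : blk_hi;
    for (uint32_t a = lo; a < hi; a += 4) {
        uint32_t idx = (a & (BLOCK_BYTES - 1)) >> 2;
        b->data[idx] = lower_level_read((b->pbase << 12) | (a & (BLOCK_BYTES - 1)));
        b->w[idx]    = false;                /* fetched data is clean          */
    }
}
```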

Three concrete examples are given below. The examples of stack space allocation and reclamation, a stack cache hit and a stack cache miss show how the stack cache proposed by the present invention, organized in blocks and with a process address space identifier field added to the block tags, distinguishes the space occupied by different processes and implements a stack caching method suitable for context switching.

Example 1. Stack space allocation. The virtual address is 32 bits, the stack-bottom address is 0x7fff8000, the stack-top address decreases to 0x7fff7c00, the process address space identifier PASID in the control register is process 8, the stack cache block size is 4 KB, the Vbase of each stack cache block tag is 20 bits, the Offset is 12 bits, and the stack cache size is 16 KB, divided into 4 stack cache blocks. There is no stack cache block with process number 8 in the stack cache, and no block whose tag valid bit is 0. The space to be allocated is 1 KB (0x7fff8000 - 0x7fff7c00). The block with the largest Age is sought: the second stack cache block has the largest Age, 8, its process number is 1, its stack-top address is 0x7fff7b00, its stack-bottom address is 0x7fff8000, its Vbase is 0x7fff7, its Valid is 1 and its Pbase is 0x00ff7. That block is evicted: from the stack top of process 1 to the stack bottom, i.e. offsets 0xb00 to 0xfff, every indexed data entry whose W bit is 1 is dirty and is written to the lower-level memory system; after the write-back its W field is set to 0, and the write-back address is Pbase concatenated with Offset. For example, if the data at Offset 0xb00 is dirty and Pbase is 0x00ff7, the write-back address is 0x00ff7b00. The stack cache block size is 4 KB, which is at least as large as the space to be allocated, so it is enough for the new process's stack space. Process 8 takes over the second stack cache block: in its tag, Vbase is set to 0x7fff7, Valid to 1, PASID to the PASID in the control register, i.e. 8, Pbase to the physical base address 0x01ff7 corresponding to that virtual base address, Bottom to 0x7fff8000, Top to 0x7fff7c00, and Age to 0. The Age of the other stack cache blocks is incremented by 1. When the stack space is reclaimed and the stack-top address increases to 0x7fff7f00, the corresponding data in the stack cache need not be written to the lower-level memory system; only the stack top is changed to 0x7fff7f00.
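
The address arithmetic of this example can be checked with a few lines of C; the write-back address is formed by concatenating the 20-bit Pbase with the 12-bit Offset as stated above:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Example 1: size of the newly allocated stack space. */
    uint32_t bottom = 0x7fff8000u, new_top = 0x7fff7c00u;
    assert(bottom - new_top == 0x400u);            /* 1 KB of new stack space  */

    /* Write-back address of a dirty word in the evicted block. */
    uint32_t pbase = 0x00ff7u, offset = 0xb00u;
    uint32_t writeback = (pbase << 12) | offset;
    assert(writeback == 0x00ff7b00u);              /* matches the example      */

    printf("alloc = %u bytes, writeback = 0x%08x\n",
           (unsigned)(bottom - new_top), writeback);
    return 0;
}
```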

Example 2. The address Vaddr of the memory access instruction is 0x7fff7f80, the process address space identifier PASID in the control register is process 7, the stack cache block size is 4 KB, the Vbase of each stack cache block tag is 20 bits, the Offset is 12 bits, and the stack cache size is 16 KB, divided into 4 stack cache blocks. The base address (Base) of the access address is 0x7fff7 and the offset (Offset) is 0xf80. The first stack cache block in the stack cache has virtual base address (Vbase) 0x7fff7, process address space identifier (PASID) 7, valid bit (Valid) 1, stack-bottom address (Bottom) 0x7fff8000, stack-top address (Top) 0x7fff7400, physical base address (Pbase) 0x01ff7 and Age 0. The data indexed at offset 0xf80 in the first stack cache block is 0x01fc00c0. The tag comparison finds that the access address is greater than or equal to the stack-top address and less than or equal to the stack-bottom address (0x7fff7400 ≤ 0x7fff7f80 ≤ 0x7fff8000), the access base address (0x7fff7) equals the virtual base address of the first stack cache block tag, the valid bit field of that tag is 1, and the process address space identifier field equals the process address space identifier of the access instruction, namely 7. The stack cache hit conditions are satisfied, and the hit information is returned. The data 0x01fc00c0 obtained by indexing the data field of the first stack cache block with the offset is selected, and the hit data 0x01fc00c0 is output.
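
Re-stating the block hit condition with the values of this example reproduces the hit (a PASID of 8 is used only to show that a different process would miss); this is an illustrative check, not part of the patent:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t vbase, top, bottom, pasid;
    bool     valid;
} scb_tag2;

static bool hit(const scb_tag2 *t, uint32_t base, uint32_t vaddr, uint32_t pasid)
{
    return t->valid && base == t->vbase &&
           vaddr >= t->top && vaddr <= t->bottom && t->pasid == pasid;
}

int main(void)
{
    /* Values taken from Example 2: the first block hits for process 7. */
    scb_tag2 blk = { .vbase = 0x7fff7u, .top = 0x7fff7400u,
                     .bottom = 0x7fff8000u, .pasid = 7u, .valid = true };
    uint32_t vaddr = 0x7fff7f80u;
    uint32_t base = vaddr >> 12, offset = vaddr & 0xfffu;

    assert(base == 0x7fff7u && offset == 0xf80u);
    assert(hit(&blk, base, vaddr, 7u));       /* stack cache hit              */
    assert(!hit(&blk, base, vaddr, 8u));      /* a different PASID would miss */
    return 0;
}
```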

Example 3. The address Vaddr of the memory access instruction is 0x7fff7f80, the process address space identifier PASID in the control register is process 7, the stack cache block size is 4 KB, the Vbase of each stack cache block tag is 20 bits, the Offset is 12 bits, and the stack cache size is 16 KB, divided into 4 stack cache blocks. The base address (Base) of the access address is 0x7fff7 and the offset (Offset) is 0xf80. The first stack cache block tag in the stack cache has virtual base address (Vbase) 0x7fff6, process address space identifier (PASID) 7, valid bit (Valid) 1, stack-bottom address (Bottom) 0x7fff8000, stack-top address (Top) 0x7fff6000, physical base address (Pbase) 0x01ff6 and Age 2. The access address is greater than or equal to the stack-top address and less than or equal to the stack-bottom address (0x7fff6000 ≤ 0x7fff7f80 ≤ 0x7fff8000), the valid bit field of the tag is 1, and the process address space identifier field of the tag equals the process identifier of the access instruction, namely 7. However, the access base address does not equal the virtual base address 0x7fff6 of the first stack cache block, no other stack cache block has process address space identifier 7, and the valid bit of the third stack cache block's tag is 0. The stack cache therefore misses; the access address belongs to the stack space, so the data of the stack cache block containing the miss address, between the stack-top and stack-bottom addresses, is fetched from the lower-level memory system into the third stack cache block, keeping the stack cache block contiguous. That is, the data from virtual address 0x7fff7000 to 0x7fff7fff is fetched, corresponding to physical addresses 0x01ff7000 to 0x01ff7fff. When the data returns from the lower-level memory system, Vbase in the third stack cache block's tag is set to 0x7fff7, Pbase to the corresponding physical base address 0x01ff7, the Valid field to 1, the PASID field is filled with the address space identifier 7 from the control register, the Age field is set to 0, and the Age fields of the other stack cache blocks are incremented by 1. The returned data is stored in the Data field of the third stack cache block, and the corresponding W fields are set to 0.

The above description of the embodiments makes the advantages of the present invention clear: the invention overcomes the shortcoming that the traditional stack caching method is unsuitable for process (including thread) context switching, and it is highly feasible.

Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the invention may still be modified or equivalently replaced, and any modification or partial replacement that does not depart from the spirit and scope of the present invention shall fall within the scope of the claims of the present invention.

Claims (8)

1. A stack cache memory suitable for context switching, comprising:

two or more stack cache blocks, each stack cache block consisting of a tag part, a data part and a control part;

an OR gate circuit, connected to the outputs of the control parts of the stack cache blocks, for ORing the hit signals of the individual stack cache blocks and outputting the hit or miss result of the stack cache memory; and

a selector, connected to the outputs of the control parts of the stack cache blocks and to the outputs of the data parts of the stack cache blocks, for selecting the data of the hit stack cache block and outputting the stack cache hit data.

2. The stack cache memory suitable for context switching according to claim 1, wherein each of the stack cache blocks has the same structure.

3. The stack cache memory suitable for context switching according to claim 2, wherein the tag part of each stack cache block includes a virtual base address field, a valid bit field, a physical base address field, a stack-top address field, a stack-bottom address field, and a process address space identifier field.

4. The stack cache memory suitable for context switching according to claim 3, wherein the data part of each stack cache block includes a data field and a dirty bit field indicating whether the data has been written.

5. The stack cache memory suitable for context switching according to any one of claims 1 to 4, wherein the control part of each stack cache block includes at least three comparison circuits and an AND gate circuit; the inputs of the first comparison circuit are connected to the virtual base address field of the tag part and to the base address field of the stack cache access address, for deciding whether the base address of the access equals the virtual base address of the tag part, and its output is connected to the AND gate circuit; the inputs of the second comparison circuit are connected to the stack-top address field and the stack-bottom address field of the tag part and to the stack cache access address, for deciding whether the access address belongs to the stack space, and its output is connected to the AND gate circuit; the input of the third comparison circuit is connected to the process address space identifier field of the tag part, the process address space identifier field of the control register of the microprocessor is connected to the input of the third comparison circuit, for deciding whether the value of the process address space identifier field equals the process address space identifier of the access instruction, and its output is connected to the AND gate circuit; the valid bit field of the tag part is also connected to an input of the AND gate circuit; and the output of the AND gate circuit is connected to the OR gate circuit and to the selector, respectively.

6. A microprocessor stack caching method suitable for context switching, comprising the following steps: (1) on a context switch, initializing the stack: if no corresponding process stack space has been allocated in the stack cache, recording the initial stack-bottom address and stack-top address in the stack cache; (2) allocating stack space: if the stack cache has allocatable space, allocating new unused space in the stack cache; if the stack cache has no allocatable space, selecting a stack cache block to write back to the lower-level memory system, and initializing the tag of the newly allocated stack cache block; (3) reclaiming stack space: reclaiming and releasing the stack cache space directly, without writing dirty reclaimed values back to the lower-level memory system; (4) on an instruction access to the stack cache, performing a tag comparison and determining from the comparison result whether the access hits; if it hits, executing step (5); if it misses, executing step (6); (5) outputting the hit data obtained by indexing the data of the hit stack cache block with the offset; (6) determining whether the access address belongs to the stack space; if it does not, handling the access in the data cache; if it does, fetching from the lower-level memory system the data of the stack cache block containing the miss address between the stack-bottom and stack-top addresses.

7. The microprocessor stack caching method suitable for context switching according to claim 6, wherein in step (2), if the stack cache has no allocatable space, the stack cache block to be written back to the lower-level memory system is selected using a first-in-first-out policy, a random policy, or a least-recently-used policy; if the first-in-first-out policy is used, the earliest-allocated stack cache block is selected by adding to the tag of each stack cache block a field representing the allocation time of the block, said field of the newly allocated stack cache block's tag being cleared and said field of the other stack cache blocks' tags being incremented by 1.

8. The microprocessor stack caching method suitable for context switching according to claim 6 or 7, wherein in step (4), the tag comparison means judging whether the following conditions are satisfied simultaneously: the access address is greater than or equal to the stack-top address and less than or equal to the stack-bottom address; the base address of the access is the same as the virtual base address in a stack cache block tag; the valid bit field of the stack cache block tag whose virtual base address equals the access base address has the value 1; and the value of the process address space identifier field is the same as the process address space identifier of the access instruction; the stack cache access hits if the above conditions are satisfied; otherwise, the stack cache access misses.
CNB2005100868602A 2005-11-11 2005-11-11 Stack cache memory and buffer storage method suitable for context switching Active CN100377115C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100868602A CN100377115C (en) 2005-11-11 2005-11-11 Stack cache memory and buffer storage method suitable for context switching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2005100868602A CN100377115C (en) 2005-11-11 2005-11-11 Stack cache memory and buffer storage method suitable for context switching

Publications (2)

Publication Number Publication Date
CN1963789A CN1963789A (en) 2007-05-16
CN100377115C true CN100377115C (en) 2008-03-26

Family

ID=38082852

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100868602A Active CN100377115C (en) 2005-11-11 2005-11-11 Stack cache memory and buffer storage method suitable for context switching

Country Status (1)

Country Link
CN (1) CN100377115C (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104699627B (en) * 2013-12-06 2019-05-07 上海芯豪微电子有限公司 A kind of caching system and method
CN105808576B (en) * 2014-12-30 2019-05-28 展讯通信(天津)有限公司 A kind of digital data recording system and method
WO2017049590A1 (en) * 2015-09-25 2017-03-30 Intel Corporation Systems and methods for input/output computing resource control
CN110134617A (en) * 2019-05-15 2019-08-16 上海东软载波微电子有限公司 Address space allocation method and device, computer readable storage medium
CN114840143A (en) * 2022-05-09 2022-08-02 Oppo广东移动通信有限公司 Stack space characteristic-based cache processing method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6167488A (en) * 1997-03-31 2000-12-26 Sun Microsystems, Inc. Stack caching circuit with overflow/underflow unit
US20040162947A1 (en) * 2003-01-16 2004-08-19 Ip-First, Llc. Microprocessor with variable latency stack cache
CN1619511A (en) * 2004-01-16 2005-05-25 智慧第一公司 Microprocessor and apparatus for performing fast speculative load operations

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6167488A (en) * 1997-03-31 2000-12-26 Sun Microsystems, Inc. Stack caching circuit with overflow/underflow unit
US20040162947A1 (en) * 2003-01-16 2004-08-19 Ip-First, Llc. Microprocessor with variable latency stack cache
CN1619511A (en) * 2004-01-16 2005-05-25 智慧第一公司 Microprocessor and apparatus for performing fast speculative load operations
CN1632877A (en) * 2004-01-16 2005-06-29 智权第一公司 Variable latency stack cache memory and method of providing data

Also Published As

Publication number Publication date
CN1963789A (en) 2007-05-16

Similar Documents

Publication Publication Date Title
US7793049B2 (en) Mechanism for data cache replacement based on region policies
US6260114B1 (en) Computer cache memory windowing
US6427188B1 (en) Method and system for early tag accesses for lower-level caches in parallel with first-level cache
CN102819497B (en) A kind of memory allocation method, Apparatus and system
US20120159103A1 (en) System and method for providing stealth memory
CN102985910A (en) GPU support for garbage collection
CN101510176B (en) Control method of general-purpose operating system for accessing CPU two stage caching
US5765199A (en) Data processor with alocate bit and method of operation
KR20000052480A (en) System and method for cache process
CN101571835B (en) Implementation method of changing Cache group associativity based on program requirements
CN1093961C (en) Enhanced memory performace of processor by elimination of outdated lines in second-level cathe
CN101008922A (en) Segmentation and paging data storage space management method facing heterogeneous polynuclear system
CN113641596A (en) Cache management method, cache management device and processor
CN105446889A (en) Memory management method, device and memory controller
CN101694640A (en) Method for realizing replacement policies of shared second-level cache under multi-core architecture
CN112559389A (en) Storage control device, processing device, computer system, and storage control method
US7007135B2 (en) Multi-level cache system with simplified miss/replacement control
CN100377115C (en) Stack cache memory and buffer storage method suitable for context switching
US8266379B2 (en) Multithreaded processor with multiple caches
CN102880553B (en) The reading/writing method of the outer FLASH file system of a kind of sheet based on MCU
US7398371B2 (en) Shared translation look-aside buffer and method
CN100428200C (en) Method for implementing on-chip command cache
JP3964821B2 (en) Processor, cache system and cache memory
CN114116537B (en) A method and device for adaptive fusion of address and instruction cache
CN109614349B (en) Cache management method based on binding mechanism

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Assignee: Beijing Loongson Zhongke Technology Service Center Co., Ltd.

Assignor: Institute of Computing Technology, Chinese Academy of Sciences

Contract fulfillment period: 2009.12.16 to 2028.12.31

Contract record no.: 2010990000062

Denomination of invention: Stack cache memory applied for context switch and buffer storage method

Granted publication date: 20080326

License type: exclusive license

Record date: 20100128

LIC Patent licence contract for exploitation submitted for record

Free format text: EXCLUSIVE LICENSE; TIME LIMIT OF IMPLEMENTING CONTACT: 2009.12.16 TO 2028.12.31; CHANGE OF CONTRACT

Name of requester: BEIJING LOONGSON TECHNOLOGY SERVICE CENTER CO., LT

Effective date: 20100128

EC01 Cancellation of recordation of patent licensing contract

Assignee: Longxin Zhongke Technology Co., Ltd.

Assignor: Institute of Computing Technology, Chinese Academy of Sciences

Contract record no.: 2010990000062

Date of cancellation: 20141231

EM01 Change of recordation of patent licensing contract

Change date: 20141231

Contract record no.: 2010990000062

Assignee after: Longxin Zhongke Technology Co., Ltd.

Assignee before: Beijing Loongson Zhongke Technology Service Center Co., Ltd.

LICC Enforcement, change and cancellation of record of contracts on the licence for exploitation of a patent or utility model
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20070516

Assignee: Longxin Zhongke Technology Co., Ltd.

Assignor: Institute of Computing Technology, Chinese Academy of Sciences

Contract record no.: 2015990000066

Denomination of invention: Stack cache memory applied for context switch and buffer storage method

Granted publication date: 20080326

License type: Common License

Record date: 20150211

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20200818

Address after: 100095, Building 2, Longxin Industrial Park, Zhongguancun Environmental Science and Technology Demonstration Park, Haidian District, Beijing

Patentee after: LOONGSON TECHNOLOGY Corp.,Ltd.

Address before: 100080, No. 6 South Road of Academy of Sciences, Zhongguancun, Haidian District, Beijing

Patentee before: Institute of Computing Technology, Chinese Academy of Sciences

EC01 Cancellation of recordation of patent licensing contract
EC01 Cancellation of recordation of patent licensing contract

Assignee: LOONGSON TECHNOLOGY Corp.,Ltd.

Assignor: Institute of Computing Technology, Chinese Academy of Sciences

Contract record no.: 2015990000066

Date of cancellation: 20200928

CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: 100095 Building 2, Longxin Industrial Park, Zhongguancun environmental protection technology demonstration park, Haidian District, Beijing

Patentee after: Loongson Zhongke Technology Co.,Ltd.

Address before: 100095 Building 2, Longxin Industrial Park, Zhongguancun environmental protection technology demonstration park, Haidian District, Beijing

Patentee before: LOONGSON TECHNOLOGY Corp.,Ltd.