WO2020135209A1 - Method for reducing bank conflicts - Google Patents
- Publication number
- WO2020135209A1 (PCT/CN2019/126552)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- memory
- data
- bank
- shift step
- shift
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
Definitions
- the invention relates to a method for reducing memory bank conflicts, which belongs to the technical field of memory.
- the processor may need to access data stored in the same memory bank in parallel.
- the address of the data in the memory is fixed, and a bank conflict will occur at this time. That is, these data located in the same memory bank must be accessed one by one.
- each column represents a memory bank, that is, the memory has 8 memory banks, Bank 0, ..., Bank 7. If data at multiple addresses in the same memory bank is accessed, for example 0, 8, 16 and 24 in Bank 0 (shown in gray in FIG. 3), Bank 0 must be accessed over multiple cycles, one item at a time, which reduces memory performance.
- the object of the present invention is to provide a solution for reducing memory bank conflicts in the micro-architecture design of a processor's LSU (Load/Store Unit), so as to improve memory performance.
- the first aspect of the present invention provides a method for reducing memory bank conflicts, including:
- the data in the memory is moved according to the shift step, wherein the row address value of the data in the memory within the memory bank is unchanged.
- the present invention ensures that data that may conflict during access is no longer located in the same memory bank, which reduces memory bank conflicts and improves memory performance.
- the shift steps of the multiple data in which the bank conflict occurs are different from each other, so that after the data in the memory is moved by the shift step, the multiple data in which the bank conflict occurs will be respectively located in different banks.
- the n bits selected from the binary address of the data in the memory may be consecutive n bits.
- the n bits selected from the binary address of the data in the memory may also be discontinuous n bits.
- the shift step can be the value represented by the n consecutive bits starting at bit n+i or bit n+i+1 of the binary address of the data in the memory, or the value represented by n non-consecutive bits of that binary address. The choice among these shift steps depends on the specific bank conflict and on the software configuration.
- the storage space of the memory can be divided into multiple pages, and multiple sets of registers are provided to configure the pages to determine the shift step size for each page, and the number of registers is less than or equal to the number of pages. Further, when the number of registers is smaller than the number of pages, each register can be mapped to multiple pages in a TLB manner.
- a second aspect of the present invention provides a device for reducing memory bank conflicts, including:
- the bank conflict determination unit is configured to determine a binary address of a plurality of data in which bank conflicts occur when parallel access is made to a memory containing N banks;
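- the shift step determination unit is configured such that after the data in the memory is moved by its respective shift step, multiple data in which a bank conflict occurs will no longer be located in the same bank, where the shift step indicates the number of banks by which the data in a bank is to be moved and is the value represented by n bits selected from the binary address of the data in the memory, where N = 2^n and N and n are both natural numbers;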
- the shift unit is configured to shift the data in the memory according to a shift step, wherein the row address value of the data in the memory within the memory bank is unchanged.
- the shift step determination unit is further configured such that the shift steps of the multiple data in which a bank conflict occurs are all different from one another, so that after the data in the memory is moved by the shift step, those data will each be located in a different memory bank.
- n bits selected from the binary address of the data in the memory may be consecutive n bits.
- the n bits selected from the binary address of the data in the memory may also be discontinuous n bits, and the shift step size of the data in which the bank conflict occurs may be completely different.
- the shift step determination unit may be further configured to divide the storage space of the memory into multiple pages and to provide multiple sets of registers that configure the pages so that a shift step is selected for each page separately; the number of registers is less than or equal to the number of pages. Further, when the number of registers is less than the number of pages, each register is mapped to multiple pages in the manner of a TLB.
- a third aspect of the present invention provides a processor.
- the processor includes a memory having a plurality of memory banks and the device provided by the foregoing second aspect or any implementation of the second aspect.
- a fourth aspect of the present invention provides an electronic device.
- the electronic device may be any of various computers, servers, or mobile devices, and includes the processor provided in the foregoing third aspect or any implementation of the third aspect.
- a fifth aspect of the present invention provides a computer program product that includes program code; when the computer program product is executed by a controller, the controller performs the method provided by the foregoing first aspect or any implementation of the first aspect.
- the present invention performs a shift operation by simply extracting a few bits in the address of the data stored in the memory, moves the data that may conflict to different memory banks, reduces or eliminates the memory bank conflicts, and improves the memory performance.
- FIG. 1 is a flowchart of a method for reducing memory bank conflict according to an embodiment of the present invention.
- FIG. 2 is a schematic diagram of a system capable of reducing memory bank conflicts according to an embodiment of the present invention.
- FIG. 3 is a schematic diagram of a memory having eight memory banks according to an embodiment of the present invention.
- FIG. 4 is a schematic diagram of a memory after using address [5:3] to move data in a memory according to an embodiment of the present invention.
- FIG. 5 is a schematic diagram of a memory after using address [6:4] to move data in a memory according to an embodiment of the present invention.
- FIG. 6 is a schematic diagram of a memory after using values represented by bits 6, 5, and 3 of an address as shift steps to move data in a memory according to an embodiment of the present invention.
- FIG. 7 is a schematic diagram of hardware implementation of memory address reorganization according to an embodiment of the present invention.
- Memory devices such as dynamic random access memory (DRAM) or static random access memory (SRAM) may include multiple banks, and devices such as processors may independently access multiple banks.
- a memory bank conflict occurs.
- the present invention uses address conversion to move data that may be in conflict to different memory banks, thereby reducing memory bank conflicts.
- a method for reducing memory bank conflict includes the following steps:
- in step S101, the binary addresses of multiple data that would cause a bank conflict when a memory including multiple banks is accessed in parallel are determined. For example, as shown in FIG. 3, in the memory 300 with eight banks, Bank 0, Bank 1, ..., Bank 7, data in the eight banks can normally be accessed in parallel through 8 addresses without conflict. However, if multiple data stored in the same bank are hit at the same time, for example the four addresses 0, 8, 16 and 24 in Bank 0 (shown in gray in FIG. 3), a bank conflict occurs. When the data at 0, 8, 16 and 24 must be accessed simultaneously, 0, 8, 16 and 24 are the locations at which the bank conflict occurs.
- a shift step size is determined.
- the conflicting data 0, 8, 16, 24 can be transferred in shift steps so that they are located in different memory banks.
- the value represented by the 3 consecutive bits starting at bit 3 of the address and extending toward the higher-order bits can be used as the shift step; that is, address[5:3], bits 5 down to 3 of the address of the data stored in the memory, determines how the data is moved. (In address notation, the rightmost, lowest bit is usually taken as bit 0 and bit numbers increase to the left; all bit positions here are expressed this way.)
- the shift steps of the first row of data, 0-7, are all 000, so the first row does not move; the shift steps of the second row of data, 8-15, are all 001, so that row moves by one bank; the shift steps of the remaining data are obtained in the same way.
- the data in which a bank conflict occurs is moved to different memory banks, but the row address value of the data inside the bank does not change. In terms of the figure, each data item moves within its own row, because the shift step changes only the decoding of the bank and not the decoding of the row address inside the bank.
- the above shift step selection can be flexibly configured according to different data access requirements.
- suppose that, in a memory with 8 banks like the memory 300 of FIG. 3, the data at 0, 16, 32 and 48 in Bank 0 conflicts. To move this data to different banks, the value represented by the 3 consecutive bits starting at bit 4 of the address and extending toward the higher-order bits can be used as the shift step; that is, the value of address[6:4] is used as the shift step to move the data in the memory.
- the shift steps of the data at 0, 16, 32 and 48 are 000, 001, 010 and 011, so the data at 0, 16, 32 and 48 is moved to Bank 0, Bank 1, Bank 2 and Bank 3, respectively.
- the result after the shift is shown in Figure 5.
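The address[6:4] case can be checked with a few lines of Python. This is only a sketch under stated assumptions: addresses are treated as word indices, the unshifted bank is the low 3 bits of the address, and the shift is applied as a modular addition to the bank number. The function name `bank_after_shift` is hypothetical.

```python
def bank_after_shift(addr, shift_lo, n_bits=3):
    """Return the bank an address lands in after applying the shift step.

    Assumptions (not stated in this form in the source): the unshifted bank is
    the low n_bits of the address, and the shift step wraps around modulo
    2**n_bits when added to it.
    """
    mask = (1 << n_bits) - 1
    bank_sel = addr & mask                   # original bank number
    shift_sel = (addr >> shift_lo) & mask    # n consecutive bits starting at shift_lo
    return (bank_sel + shift_sel) & mask     # same as modulo 2**n_bits

# address[6:4] as the shift step for the conflicting data at 0, 16, 32, 48
banks = [bank_after_shift(a, shift_lo=4) for a in (0, 16, 32, 48)]
print(banks)  # [0, 1, 2, 3]
```

All four items start in Bank 0 and end up in four different banks, which matches the shift steps 000, 001, 010 and 011 given in the text.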
- the above embodiment shows a case where several consecutive bits in the address of the data stored in the memory are selected to represent the shift step. It is also possible to select several non-consecutive bits in the address as the shift step.
- if the conflicting data is the data at 8, 16 and 32 in Bank 0, then to move this data to different memory banks, the value formed by bits 6, 5 and 3 of the address can be selected as the shift step; the data is moved to Bank 1, Bank 0 and Bank 2, respectively, giving the result shown in FIG. 6.
- Fig. 6 only schematically shows some rows, and the remaining rows are omitted.
- the data consisting of bits 5, 4, and 2 of the address may also be selected as the shift step, or the data consisting of bits 6, 3, and 1 may be selected as the shift step, etc.
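Selecting non-consecutive bits amounts to gathering individual address bits into an n-bit value. The sketch below illustrates this for bits 6, 5 and 3. It assumes word-indexed addresses, an unshifted bank equal to the address modulo 8, and a shift applied by modular addition; `gather_bits` and `bank_after_shift` are hypothetical names.

```python
def gather_bits(addr, positions):
    """Concatenate the address bits at `positions` (most significant first)."""
    value = 0
    for pos in positions:
        value = (value << 1) | ((addr >> pos) & 1)
    return value

def bank_after_shift(addr, positions, num_banks=8):
    """Bank after shifting by the value of the selected bits (assumed modular add)."""
    bank_sel = addr % num_banks  # assumed unshifted bank mapping
    return (bank_sel + gather_bits(addr, positions)) % num_banks

# bits 6, 5 and 3 as the shift step for the conflicting data at 8, 16, 32
print([bank_after_shift(a, (6, 5, 3)) for a in (8, 16, 32)])  # [1, 0, 2]
```

Address 16 has none of bits 6, 5 or 3 set, so it stays in Bank 0, while 8 and 32 move to Bank 1 and Bank 2.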
- in the present invention, the row spacing of the conflicting data is the difference between the row address values of two data items.
- the shift step size can be selected in the following manner.
- the shift step size can be selected as the value represented by consecutive n bits from the n+i bit of the binary address of the data in the memory to the higher bit direction.
- the conflict data 0, 16, 32, 48 are moved to
- the shift step can be the value represented by the n consecutive bits starting at bit n+i or bit n+i+1 of the binary address of the data in the memory, or the value represented by n non-consecutive bits of that binary address. For example, also using the example of a memory with 8 banks shown in FIG.
- n bits in the address can be selected as a shift step in various ways.
- although FIGS. 3-5 show an example of a memory with 8 memory banks, those skilled in the art should understand that the number of banks and the memory structure shown here are only for convenience of explanation and do not limit the present invention.
- the present invention can also be applied to memories with different numbers of banks of 4, 16, 32, etc.
- different numbers of address bits can be selected as the shift data. For example, in a memory with 32 banks, when a bank conflict occurs, 5 bits of the address can be selected as the shift data, and the value represented by the selected 5 bits can be used to move the conflicting data to different banks, thereby reducing bank conflicts.
- for a memory with N memory banks, n bits can be selected from the memory address to decode the bank selection.
- this signal can be named bank_sel (i.e., bank_sel is the n bits selected from the address).
- a shift step, which can be represented by shift_sel, can be added to bank_sel.
- as shown in FIG. 7, for each row of such a memory with N banks there are N possible shifts (0, 1, 2, ..., N-1); that is, the value of shift_sel may be 0 to N-1.
- an address with 5 fields can be defined, including the bank selection (bank_sel), the shift step (shift_sel), the byte offset (offset), and two row index fields (index_h and index_l), to generate the decoded signals.
- the bank selection (bank_sel) and shift step (shift_sel) are used to decode the bank selection, as described above; the row indexes (index_h and index_l) are used to index a specific row of the bank, that is, the row address of the data inside the memory bank; the byte offset (offset) is used to index the offset within a memory bank. For example, if the memory bank width is 4 bytes, the byte offset (offset) should be 2 bits.
- the position of each field can be configured by software. For example, whether shift_sel and bank_sel are adjacent can be configured by software; if they are not adjacent, as shown in FIG. 7, the width of index_l, for example, can also be configurable.
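The five-field address can be illustrated with a small decoder. The field order and widths below are assumptions chosen for illustration (offset, bank_sel, index_l, shift_sel, index_h from the lowest bit upward, with shift_sel and bank_sel separated by index_l). The text states that field positions are software-configurable and does not fix this layout. Forming the row by concatenating index_h and index_l, and combining bank_sel with shift_sel by modular addition, are likewise assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Layout:
    """Hypothetical field widths, listed from the least significant bit upward."""
    offset_bits: int = 2    # 4-byte-wide bank -> 2-bit byte offset (from the text)
    bank_bits: int = 3      # n = 3 for N = 8 banks
    index_l_bits: int = 2   # configurable width in the text; 2 is an assumption
    shift_bits: int = 3     # shift_sel also has n bits

def decode(addr, layout=Layout()):
    """Split a byte address into its fields and compute the shifted bank."""
    def take(width):
        nonlocal addr
        value = addr & ((1 << width) - 1)
        addr >>= width
        return value

    offset = take(layout.offset_bits)
    bank_sel = take(layout.bank_bits)
    index_l = take(layout.index_l_bits)
    shift_sel = take(layout.shift_bits)
    index_h = addr                                   # remaining high bits
    row = (index_h << layout.index_l_bits) | index_l  # assumed concatenation
    bank = (bank_sel + shift_sel) % (1 << layout.bank_bits)
    return {"offset": offset, "bank": bank, "row": row,
            "bank_sel": bank_sel, "shift_sel": shift_sel}

# Build an address from known field values, then decode it again.
addr = ((((1 << 3 | 3) << 2 | 2) << 3 | 5) << 2) | 3
print(decode(addr))
```

Because the row is built only from index_h and index_l, changing shift_sel moves data between banks without touching the row address, which is the property the method relies on.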
- the key is how to obtain the shift step shift_sel.
- a simple method is to select another n bits in the address as shift_sel according to data access requirements.
- both the bank_sel bits and the shift_sel bits can be any n bits of the address. For example, they can be the same n bits of the address, or they can partially overlap.
- the software configuration can select n bits of the address as shift_sel according to the application scenario. If no movement is wanted, a value that means "no movement" can be configured instead. As mentioned earlier, the selected bits can be n consecutive bits or n non-consecutive bits of the address.
- Another more complex but more flexible method is to select different bits as shift_sel in different address spaces.
- the storage space of the memory may be divided into multiple pages (page), for example, divided into Page 1, Page 2, Page 3...
- Pages may have n KB (or another amount) of space.
- a set of registers can be allocated for each page.
- the number of register groups may be smaller than the number of address pages, that is, a group of register groups may be configured to be mapped to different pages in a TLB manner, thereby saving registers.
- This method is in effect a special TLB (Translation Lookaside Buffer).
- Traditionally TLB is generally used to convert a virtual address to a physical address, but here, a TLB can be provided to transfer the data in the memory from one memory bank to another memory bank in different shift modes on different pages.
- the address in one page can be represented by m bits. Assuming that the entire address has k bits, other k-m bit addresses may constitute page addresses. You can define multiple sets of configuration registers, and use these multiple sets of configuration registers as TLB. When accessing the memory, the page address of the accessed address is compared with the page addresses in all configuration register groups.
- if a register group with the same page address is found, that group's configuration is used to select shift_sel; if no matching value is found, the hardware raises an interrupt asking the software to fill or replace the TLB entry, or automatically replaces the TLB entry with a value that the software has pre-configured in a second TLB.
- the software must ensure that no two or more register bank configurations correspond to the same page. In this way, a large number of page address spaces can be configured with a relatively small number of register groups, and the addresses can be converted from one memory bank to another in different shift modes in different page address spaces.
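The page-based lookup can be modelled in software as a small table keyed by page address. The following is a minimal sketch, not the hardware design: the class name `ShiftTLB`, the `refill` callback standing in for the software interrupt handler, and the oldest-entry eviction policy are all assumptions. Each entry stores the bit positions used to form shift_sel for that page.

```python
class ShiftTLB:
    """Toy model of configuration register groups used as a TLB."""

    def __init__(self, capacity, page_bits, refill):
        self.capacity = capacity    # number of register groups
        self.page_bits = page_bits  # m: bits of the address within a page
        self.refill = refill        # stands in for the software miss handler
        self.entries = {}           # page address -> bit positions for shift_sel

    def lookup(self, addr):
        page = addr >> self.page_bits  # the remaining k - m bits
        if page not in self.entries:   # miss: ask "software" for the config
            if len(self.entries) >= self.capacity:
                oldest = next(iter(self.entries))  # assumed eviction policy
                del self.entries[oldest]
            self.entries[page] = self.refill(page)
        return self.entries[page]

# Hypothetical per-page configuration: page 0 uses bits 5..3, others use 6..4.
def refill(page):
    return (5, 4, 3) if page == 0 else (6, 5, 4)

tlb = ShiftTLB(capacity=2, page_bits=10, refill=refill)
print(tlb.lookup(0x010))   # page 0
print(tlb.lookup(0x400))   # page 1
print(tlb.lookup(0x800))   # page 2, evicts page 0
```

Keying the table by page address means no two entries can describe the same page, which mirrors the requirement placed on the software configuration.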
- the processor may include a memory 20 having multiple memory banks and a device 10 for reducing memory bank conflicts.
- the device 10 includes a bank conflict determination unit 101, configured to determine the binary addresses of data that will cause a bank conflict when a memory containing N banks is accessed in parallel; a shift step determination unit 103, configured such that after the data in the memory is moved by the shift step, the data in which a bank conflict occurs is no longer in the same bank, where the shift step indicates the number of banks by which the data is moved and is the value represented by n bits selected from the binary address of the data in the memory, where N = 2^n; and a shift unit 105, configured to move the data in the memory according to the shift step, where the row address value of the data inside the bank remains unchanged.
- the device 10 can execute the method shown in FIG. 1 to reduce bank conflicts.
- a logical unit may be a physical unit, may be a part of a physical unit, or may be implemented as a combination of multiple physical units.
- the physical implementation of these logical units is not the most important.
- the combination of functions implemented by these logical units is the key to solving the technical problems proposed by the present invention.
- units that are not closely related to solving the technical problems proposed by the present invention are not introduced in the embodiments of the present invention, which does not mean that there are no other units in the embodiments.
- an electronic device including the processor described above may be any of various computing devices, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers; various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, portable digital assistants (PDAs), portable game consoles, handheld computers, or tablet computers; or various smart devices, such as wearable smart devices and smart home appliances.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Executing Machine-Instructions (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
Description
The invention relates to a method for reducing memory bank conflicts and belongs to the technical field of memory.
Recently, as demands on processors' data-processing capability have grown, some processors have been designed to access memory in parallel. When the memory has multiple banks, the processor can access several banks in parallel without conflict.
However, in some cases the processor may need to access, in parallel, data stored in the same memory bank. Under normal circumstances the address of data in the memory is fixed, so a bank conflict occurs; that is, the data located in the same bank must be accessed one item at a time. For example, in the memory shown in FIG. 3, each column represents one bank, so the memory has 8 banks, Bank 0, ..., Bank 7. If data at multiple addresses in the same bank is accessed, for example 0, 8, 16 and 24 in Bank 0 (shown in gray in FIG. 3), Bank 0 must be accessed over multiple cycles, one item at a time, which reduces memory performance.
Summary of the invention
The object of the present invention is to provide a solution for reducing memory bank conflicts in the micro-architecture design of a processor's LSU (Load/Store Unit), so as to improve memory performance.
A first aspect of the present invention provides a method for reducing memory bank conflicts, including:
determining the binary addresses of multiple data items that would cause a bank conflict when a memory containing N banks is accessed in parallel;
determining a shift step, where the shift step indicates the number of banks by which data in a bank is to be moved and is the value represented by n bits selected from the binary address of the data in the memory, where N = 2^n and N and n are both natural numbers, and where, after the data in the memory has been moved by its respective shift step, the multiple data items in which a bank conflict occurs are no longer located in the same bank; and
moving the data in the memory according to the shift step, where the row address value of the data inside the bank is unchanged.
Through address conversion, the present invention ensures that data that may conflict during access is no longer located in the same bank, which reduces bank conflicts and improves memory performance.
Further, the shift steps of the multiple data items in which a bank conflict occurs are all different from one another, so that after the data in the memory is moved by the shift step, those data items are each located in a different bank. Moving all conflicting data to different banks eliminates conflicts as far as possible and improves memory performance accordingly.
Further, the n bits selected from the binary address of the data in the memory may be n consecutive bits. Alternatively, they may be n non-consecutive bits. More specifically, the row spacing of the multiple data items in which a bank conflict occurs can be expressed as 2^i + j, where i and j are natural numbers and 0 ≤ j < 2^i. When j = 0, the shift step can be the value represented by the n consecutive bits starting at bit n+i of the binary address and extending toward the higher-order bits. When j ≠ 0, the shift step can be the value represented by the n consecutive bits starting at bit n+i or bit n+i+1 of the binary address and extending toward the higher-order bits, or the value represented by n non-consecutive bits of the binary address. The choice among these shift steps depends on the specific bank conflict and on the software configuration.
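The rule relating row spacing to the starting bit can be written out as a short helper. This is a sketch of the arithmetic only; the function names are hypothetical, and for j ≠ 0 it lists only the two consecutive-bit candidates named in the text, leaving out the non-consecutive option.

```python
def split_row_spacing(spacing):
    """Return (i, j) such that spacing == 2**i + j with 0 <= j < 2**i."""
    if spacing < 1:
        raise ValueError("row spacing must be a positive integer")
    i = spacing.bit_length() - 1  # largest i with 2**i <= spacing
    j = spacing - (1 << i)
    return i, j

def candidate_start_bits(spacing, n_bits):
    """Lowest bit position(s) of the n consecutive bits used as the shift step."""
    i, j = split_row_spacing(spacing)
    if j == 0:
        return [n_bits + i]
    return [n_bits + i, n_bits + i + 1]

# 0, 8, 16, 24 sit in consecutive rows (spacing 1): start at bit 3, i.e. address[5:3].
print(candidate_start_bits(1, n_bits=3))  # [3]
# 0, 16, 32, 48 sit in every second row (spacing 2): start at bit 4, i.e. address[6:4].
print(candidate_start_bits(2, n_bits=3))  # [4]
```

With 8 banks (n = 3), the two results correspond to the address[5:3] and address[6:4] selections listed among the figures for FIG. 4 and FIG. 5.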
Further, the storage space of the memory can be divided into multiple pages, with multiple sets of registers provided to configure the pages so that a shift step is determined for each page separately; the number of registers is less than or equal to the number of pages. When the number of registers is smaller than the number of pages, each register can be mapped to multiple pages in the manner of a TLB.
A second aspect of the present invention provides a device for reducing memory bank conflicts, including:
a bank conflict determination unit, configured to determine the binary addresses of multiple data items that would cause a bank conflict when a memory containing N banks is accessed in parallel;
a shift step determination unit, configured such that after the data in the memory is moved by its respective shift step, the multiple data items in which a bank conflict occurs are no longer located in the same bank, where the shift step indicates the number of banks by which data in a bank is to be moved and is the value represented by n bits selected from the binary address of the data in the memory, where N = 2^n and N and n are both natural numbers; and
a shift unit, configured to move the data in the memory according to the shift step, where the row address value of the data inside the bank is unchanged.
Further, the shift step determination unit is further configured such that the shift steps of the multiple data items in which a bank conflict occurs are all different from one another, so that after the data in the memory is moved by the shift step, those data items are each located in a different bank.
Further, the n bits selected from the binary address of the data in the memory may be n consecutive bits. Alternatively, they may be n non-consecutive bits, and the shift steps of the data in which a bank conflict occurs may all be different.
Further, the shift step determination unit may be further configured to divide the storage space of the memory into multiple pages and to provide multiple sets of registers that configure the pages so that a shift step is selected for each page separately; the number of registers is less than or equal to the number of pages. When the number of registers is smaller than the number of pages, each register is mapped to multiple pages in the manner of a TLB.
A third aspect of the present invention provides a processor that includes a memory having multiple memory banks and the device provided by the foregoing second aspect or any implementation of the second aspect.
A fourth aspect of the present invention provides an electronic device, which may be any of various computers, servers, or mobile devices, and which includes the processor provided by the foregoing third aspect or any implementation of the third aspect.
A fifth aspect of the present invention provides a computer program product that includes program code; when the computer program product is executed by a controller, the controller performs the method provided by the foregoing first aspect or any implementation of the first aspect.
By simply extracting a few bits from the address of the data stored in the memory and using them for a shift operation, the present invention moves data that may conflict into different banks, which reduces or eliminates bank conflicts and improves memory performance.
FIG. 1 is a flowchart of a method for reducing memory bank conflicts according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a system capable of reducing memory bank conflicts according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a memory having eight memory banks according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of the memory after the data in the memory has been moved using address[5:3], according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of the memory after the data in the memory has been moved using address[6:4], according to an embodiment of the present invention.
FIG. 6 is a schematic diagram of the memory after the data in the memory has been moved using the value represented by bits 6, 5 and 3 of the address as the shift step, according to an embodiment of the present invention.
FIG. 7 is a schematic diagram of a hardware implementation of memory address reorganization according to an embodiment of the present invention.
The present invention is further described below with reference to specific embodiments and the drawings. The specific embodiments described here serve only to explain the present invention and do not limit it. In addition, for ease of description, the drawings show only the structures or processes related to the present invention, not all of them.
Memory devices such as dynamic random access memory (DRAM) or static random access memory (SRAM) may include multiple memory banks, which a device such as a processor can access independently. A bank conflict occurs when the processor needs to access, in parallel, data stored in the same bank. The present invention uses address transformation to move data that may conflict into different banks, thereby reducing bank conflicts.
According to an embodiment of the present invention, a method for reducing memory bank conflicts is provided. As shown in FIG. 1, the method includes the following steps.
First, in step S101, the binary addresses of the data items that would cause a bank conflict when a memory comprising multiple banks is accessed in parallel are determined. For example, as shown in FIG. 3, in a memory 300 having eight banks, Bank 0, Bank 1, ..., Bank 7, the data in the eight banks can normally be accessed in parallel through eight addresses without conflict. However, if multiple data items stored in the same bank are accessed at the same time, for example the four addresses 0, 8, 16 and 24 in Bank 0 (shown in gray in FIG. 3), a bank conflict occurs. When the data at 0, 8, 16 and 24 must be accessed simultaneously, 0, 8, 16 and 24 are the locations at which the bank conflict occurs.
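Step S101 can be illustrated with a minimal Python sketch. It assumes the layout of FIG. 3, in which consecutive addresses are spread across the banks so that the bank number equals the address modulo the number of banks. The function name `find_bank_conflicts` is hypothetical and not part of the original disclosure.

```python
from collections import defaultdict


def find_bank_conflicts(addresses, num_banks=8):
    """Return {bank: [addresses]} for every bank hit more than once."""
    by_bank = defaultdict(list)
    for addr in addresses:
        # Assumption: the low bits of the address select the bank (FIG. 3 layout).
        by_bank[addr % num_banks].append(addr)
    return {bank: hits for bank, hits in by_bank.items() if len(hits) > 1}


print(find_bank_conflicts([0, 8, 16, 24]))   # all four addresses fall in Bank 0
print(find_bank_conflicts(list(range(8))))   # one address per bank, no conflict
```

An empty result means the accesses can proceed in parallel; a non-empty result lists the addresses that would have to be serialized.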
Next, in step S102, a shift step is determined. The shift step indicates the number of banks by which data in the memory is moved. It is the value represented by n bits selected from the binary address of the data, where N is the number of banks in the memory, N = 2^n, and N and n are both natural numbers. The shift step is chosen so that, after the data in the memory is moved by it, the data that caused the bank conflict no longer resides in the same bank. Then, in step S103, the data in the memory is moved according to the shift step, while the row address of each data item within its bank remains unchanged.
For example, for the memory 300 shown in FIG. 3, a shift step can be used to move the conflicting data 0, 8, 16 and 24 so that they reside in different banks. The memory 300 includes 8 banks (N = 8), so each stored data item needs to move by at most 7 banks and at least 0 banks, and three address bits suffice to represent the shift step. Therefore, 3 bits can be selected from the address of the stored data to represent the shift step (n = 3). For example, according to one embodiment of the present invention, the value represented by the 3 consecutive bits starting at bit 3 and extending toward the high-order end, that is, address bits [5:3], can be used as the shift step. (In address notation, the rightmost, least significant bit is bit 0 and bit numbers increase to the left; all bit positions below follow this convention.) The data in the memory is moved according to the value of these three bits. In the memory 300, the shift step for the first row, data 0 to 7, is 000, so the first row does not move; the shift step for the second row, data 8 to 15, is 001, so each item moves by one bank; the shift steps for the remaining data are obtained in the same way. In this embodiment, the conflicting data is moved into different banks, but the row address of each item within its bank does not change. In the figures, the data moves only within its own row, because the shift step changes only the decoding of the bank and not the decoding of the row address within the bank.
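The address[5:3] example can be sketched as follows. The sketch assumes that the original bank is given by the low n address bits, that the row is given by the remaining high bits, and that a shift past the last bank wraps around to Bank 0 (modulo N). The wrap-around and the names `extract_bits` and `remap` are assumptions made for illustration.

```python
def extract_bits(addr, low, width):
    """Value of `width` consecutive address bits starting at bit `low`."""
    return (addr >> low) & ((1 << width) - 1)


def remap(addr, num_banks=8, shift_low=3):
    """Return (bank after shifting, row) for one address."""
    n = num_banks.bit_length() - 1          # N = 2**n
    bank_sel = addr & (num_banks - 1)       # original bank
    row = addr >> n                         # row inside the bank, never changed
    shift = extract_bits(addr, shift_low, n)
    # Assumption: data that moves past the last bank wraps to Bank 0.
    return (bank_sel + shift) % num_banks, row


for addr in (0, 8, 16, 24):
    bank, row = remap(addr)
    print(f"address {addr:2d} -> Bank {bank}, row {row}")
```

With these assumptions the four conflicting addresses land in Banks 0, 1, 2 and 3 while keeping rows 0, 1, 2 and 3.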
After shifting in this way, the result shown in FIG. 4 is obtained: the data 0, 8, 16 and 24 are moved into the different banks Bank 0, Bank 1, Bank 2 and Bank 3 respectively, so the processor can access 0, 8, 16 and 24 in parallel in one cycle. FIG. 4 shows only the rows containing the conflicting data; the other rows are omitted.
A comparison of FIG. 3 and FIG. 4 shows that, after the data is moved in this way, the bank conflict is eliminated and the data at 0, 8, 16 and 24 can be accessed in parallel in the same cycle, which improves memory performance.
Note that the choice of shift step can be configured flexibly according to different data access requirements. For example, in another embodiment, for a memory with 8 banks similar to the memory 300 in FIG. 3, suppose the data at 0, 16, 32 and 48 in Bank 0 conflict. To move this data into different banks, the value represented by the 3 consecutive bits starting at bit 4 and extending toward the high-order end, that is, address bits [6:4], can be used as the shift step. The shift steps of the data at 0, 16, 32 and 48 are then 000, 001, 010 and 011, so the data moves to Bank 0, Bank 1, Bank 2 and Bank 3 respectively. The result after shifting is shown in FIG. 5.
In the two examples above, all the conflicting data are moved into different banks, but this is only a preferred embodiment of the present invention. In some cases it is also possible to move only part of the conflicting data into different banks and so eliminate only part of the conflict. For example, in the memory shown in FIG. 3, suppose the data at 0, 8, 16 and 24 must be accessed simultaneously. Instead of address bits [5:3], address bits [6:4] can be used to move the data, in which case the shift steps of the four conflicting data items are 000, 000, 001 and 001, and the result after moving is the same as in FIG. 5. The data at 0, 8, 16 and 24 are moved to Bank 0, Bank 0, Bank 1 and Bank 1 respectively, so accessing these four items takes two cycles instead of four. This also reduces bank conflicts and improves memory performance to some extent.
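The difference between full and partial conflict removal can be checked numerically. The sketch below assumes that each bank serves one access per cycle, so the number of cycles equals the largest number of accesses that land in any single bank, and that shifts wrap around modulo N. The helper name `cycles_needed` is hypothetical.

```python
from collections import Counter


def cycles_needed(addresses, shift_low=None, num_banks=8):
    """Cycles to serve all accesses; shift_low=None means no shifting."""
    mask = num_banks - 1
    banks = []
    for addr in addresses:
        shift = 0 if shift_low is None else (addr >> shift_low) & mask
        banks.append(((addr & mask) + shift) % num_banks)  # wrap-around assumed
    # Assumption: one access per bank per cycle.
    return max(Counter(banks).values())


accesses = [0, 8, 16, 24]
print(cycles_needed(accesses))               # unshifted layout of FIG. 3
print(cycles_needed(accesses, shift_low=3))  # shift step = address[5:3]
print(cycles_needed(accesses, shift_low=4))  # shift step = address[6:4]
```

Under these assumptions the three calls report 4, 1 and 2 cycles, matching the full and partial cases described above.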
The embodiments above select several consecutive bits of the address of the stored data to represent the shift step. Alternatively, non-consecutive bits of the address can be selected as the shift step.
For example, in the memory with 8 banks shown in FIG. 3, in one embodiment, if the conflicting data are the data at 8, 16 and 32 in Bank 0, the value formed by bits 6, 5 and 3 of the address can be selected as the shift step to move this data into different banks. The data then moves to Bank 1, Bank 0 and Bank 2 respectively, giving the result shown in FIG. 6, which shows only some of the rows and omits the rest. Likewise, in some implementations the value formed by bits 5, 4 and 2 of the address, or by bits 6, 3 and 1, and so on, can be selected as the shift step. These choices depend on the bank conflicts that arise from different data access requirements, or on different software configurations.
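Selecting non-consecutive bits amounts to gathering individual address bits into one value. A small sketch, assuming the selected positions are listed from most significant to least significant and that shifts wrap around modulo N; `gather_bits` and `remap_bank` are illustrative names only.

```python
def gather_bits(addr, positions):
    """Concatenate the address bits at `positions` (listed MSB first)."""
    value = 0
    for pos in positions:
        value = (value << 1) | ((addr >> pos) & 1)
    return value


def remap_bank(addr, positions, num_banks=8):
    """Bank after shifting by the value of the selected address bits."""
    # Assumption: shifting past the last bank wraps around to Bank 0.
    return ((addr % num_banks) + gather_bits(addr, positions)) % num_banks


for addr in (8, 16, 32):
    print(addr, "-> Bank", remap_bank(addr, positions=(6, 5, 3)))
```

For addresses 8, 16 and 32 the gathered values are 001, 000 and 010, which gives Banks 1, 0 and 2 as in FIG. 6.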
Although the choice of shift step depends on the bank conflicts that arise from different data access requirements or on different software configurations, some rules can guide the selection. For example, when the conflicting data items are equally spaced, their row spacing (in the present invention, the difference between the row address values of two data items) can be expressed as 2^i + j, where i and j are natural numbers and 0 ≤ j < 2^i. The shift step can then be selected as follows.
When j = 0, the shift step can be the value represented by the n consecutive bits starting at bit n+i of the binary address of the data and extending toward the high-order end. For example, in the bank conflict shown in FIG. 3, the row spacing of the data at 0, 8, 16 and 24 is 1, that is, i = 0 and j = 0, so selecting the 3 consecutive bits (n = 3) starting at bit n+i = 3, that is, address bits [5:3], eliminates the bank conflict. Similarly, in the example of FIG. 5, where the conflicting data 0, 16, 32 and 48 are moved into different banks, the row spacing is 2, that is, i = 1 and j = 0, so the 3 consecutive bits starting at bit n+i = 4, that is, address bits [6:4], can be selected, which also eliminates the bank conflict. When j ≠ 0, the shift step can be the value represented by the n consecutive bits starting at bit n+i or bit n+i+1 of the binary address and extending toward the high-order end, or the value represented by n non-consecutive bits of the binary address. For example, again using the memory with 8 banks shown in FIG. 3, suppose the conflicting data are 0, 24 and 48. The row spacing is 3, that is, i = 1 and j = 1. The 3 consecutive bits starting at bit n+i = 4, that is, address bits [6:4], can be selected; the shift steps of the data at 0, 24 and 48 are then 000, 001 and 011, and the data moves to Bank 0, Bank 1 and Bank 3 respectively, which eliminates the bank conflict. Alternatively, the 3 consecutive bits starting at bit n+i+1 = 5, that is, address bits [7:5], can be selected; the shift steps are then 000, 000 and 001, and the data moves to Bank 0, Bank 0 and Bank 1 respectively, which reduces the bank conflict. Non-consecutive bits can also be selected, as long as they move the conflicting data out of the same bank.
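The selection rule for equally spaced conflicts can be written down as a short sketch. It decomposes the row spacing as 2^i + j with the largest possible i, so that 0 ≤ j < 2^i holds by construction, and returns the candidate consecutive bit fields as (high, low) pairs. The function name `candidate_shift_fields` is hypothetical, and the option of non-consecutive bits is not covered.

```python
def candidate_shift_fields(row_spacing, num_banks=8):
    """Candidate address bit fields [high:low] for a given row spacing."""
    if row_spacing < 1:
        raise ValueError("row spacing must be at least 1")
    n = num_banks.bit_length() - 1      # N = 2**n
    i = row_spacing.bit_length() - 1    # largest i with 2**i <= row_spacing
    j = row_spacing - (1 << i)          # remainder, 0 <= j < 2**i
    starts = [n + i] if j == 0 else [n + i, n + i + 1]
    return [(start + n - 1, start) for start in starts]


print(candidate_shift_fields(1))  # spacing of 0, 8, 16, 24
print(candidate_shift_fields(2))  # spacing of 0, 16, 32, 48
print(candidate_shift_fields(3))  # spacing of 0, 24, 48
```

For row spacings 1, 2 and 3 this yields [5:3], [6:4], and the pair [6:4] and [7:5], which are the fields used in the examples above.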
Note that the above way of selecting the shift step when the row spacing of the conflicting data is expressed as 2^i + j is only an example. It illustrates one possible implementation of the present invention and does not limit it. Those skilled in the art can select n bits of the address as the shift step in various ways in accordance with the idea of the present invention.
Likewise, although FIGS. 3 to 5 show a memory with 8 banks, those skilled in the art will understand that the number of banks and the memory structure shown here are for convenience of explanation only and do not limit the present invention. The present invention applies equally to memories with other numbers of banks, such as 4, 16 or 32. In memories with different numbers of banks, a different number of address bits can be selected as the shift step. For example, in a memory with 32 banks, when a bank conflict occurs, 5 bits of the address can be selected as the shift step, and the value represented by those 5 bits is used to move the conflicting data into different banks, thereby reducing bank conflicts.
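The same remapping generalizes to other bank counts by deriving n from N. The sketch below assumes N is a power of two, that the low n address bits select the original bank, and that shifts wrap around modulo N. The 32-bank addresses used in the example are invented for illustration and do not come from the original text.

```python
def remap_bank(addr, num_banks, shift_low):
    """Bank after shifting by the n-bit field that starts at bit `shift_low`."""
    if num_banks < 2 or num_banks & (num_banks - 1):
        raise ValueError("this sketch assumes N is a power of two")
    mask = num_banks - 1                 # n low bits set, N = 2**n
    bank_sel = addr & mask
    shift = (addr >> shift_low) & mask   # n selected bits give values 0..N-1
    return (bank_sel + shift) % num_banks


# 32 banks: five address bits, here address[9:5], form the shift step.
print([remap_bank(a, 32, 5) for a in (0, 32, 64, 96)])
# 4 banks: two address bits, here address[3:2], form the shift step.
print([remap_bank(a, 4, 2) for a in (0, 4, 8, 12)])
```

In both cases four addresses that originally share Bank 0 end up in Banks 0 to 3.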
An example hardware implementation of memory address reorganization according to an embodiment of the present invention is described below with reference to FIG. 7.
According to an embodiment of the present invention, assume the memory is divided into N banks. N is usually a power of 2, such as 2, 4, 8, 16 or 32, that is, N = 2^n, to avoid wasting resources, although other values may be used in some extreme cases.
Typically, n bits of the memory address are selected to decode the bank selection. As shown in FIG. 7, this signal can be named bank_sel (that is, bank_sel is the n bits selected from the address). Without the present invention, the value of bank_sel directly selects the bank to be accessed: bank_sel = 0 denotes Bank 0, bank_sel = 1 denotes Bank 1, and so on.
In embodiments of the present invention, a shift step, denoted shift_sel, can be added to bank_sel, as shown in FIG. 7. For each row of such a memory having N banks there are N possible shifts (0, 1, 2, ..., N-1), that is, shift_sel can take any value from 0 to N-1. The bank selection is then determined jointly by bank_sel and shift_sel. For example, bank_sel + shift_sel = 0 denotes Bank 0, bank_sel + shift_sel = 1 denotes Bank 1, and so on.
Thus, as shown in FIG. 7, the address can be defined as five bit fields, namely the bank selection (bank_sel), the shift step (shift_sel), the byte offset (offset), and two row index fields (index_h and index_l), which are used to generate the decode signals. The bank selection (bank_sel) and the shift step (shift_sel) are used to decode the bank selection, as described above. The row index fields (index_h and index_l) index a specific row of the bank, that is, the row address of the data within the bank. The byte offset (offset) indexes the offset within one bank; for example, if the bank width is 4 bytes, the byte offset should be 2 bits. The position of each field can be configured by software. For example, software can configure whether shift_sel and bank_sel are adjacent; if they are not adjacent, as in the case shown in FIG. 7, the width of index_l can also be configurable.
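The five-field address can be decoded in software as a rough model of the hardware. The field order used here, from the least significant end offset, bank_sel, index_l, shift_sel, index_h, and the field widths are assumptions chosen for illustration, since the actual positions are stated to be software-configurable. The wrap-around of bank_sel + shift_sel modulo N is also an assumption. `Fields`, `decode` and `selected_bank` are hypothetical names.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    index_h: int
    shift_sel: int
    index_l: int
    bank_sel: int
    offset: int


def decode(addr, n=3, offset_bits=2, index_l_bits=1):
    """Split an address into the five fields (assumed layout, LSB first)."""
    def take(width):
        nonlocal addr
        value = addr & ((1 << width) - 1)
        addr >>= width
        return value

    offset = take(offset_bits)    # byte within a 4-byte-wide bank
    bank_sel = take(n)
    index_l = take(index_l_bits)  # configurable width, 1 bit assumed here
    shift_sel = take(n)
    index_h = addr                # all remaining high bits
    return Fields(index_h, shift_sel, index_l, bank_sel, offset)


def selected_bank(fields, n=3):
    """Bank chosen jointly by bank_sel and shift_sel (modulo N assumed)."""
    return (fields.bank_sel + fields.shift_sel) % (1 << n)


addr = (5 << 9) | (3 << 6) | (1 << 5) | (6 << 2) | 2
fields = decode(addr)
print(fields, "-> Bank", selected_bank(fields))
```

Here bank_sel = 6 and shift_sel = 3, so the access is steered to Bank 1 under the wrap-around assumption.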
Once the hardware is configured, the key question is how to obtain the shift step shift_sel.
According to an embodiment of the present invention, a simple method is to select another n bits of the address as shift_sel according to the data access requirements. The bank_sel bits and the shift_sel bits can each be any n bits of the address; for example, they may be the same n bits, or they may partially overlap. Software can be configured to select n bits of the address as shift_sel according to the application scenario. If no movement is wanted, a value meaning "no shift" can be configured instead. As described above, the selected bits may be n consecutive bits of the address or n non-consecutive bits.
A more complex but more flexible method is to select different bits as shift_sel in different address spaces.
For example, in some embodiments, the storage space of the memory can be divided into multiple pages, such as Page 1, Page 2, Page 3 and so on, each of which may have n KB (or some other amount) of space. Multiple sets of registers can be provided to configure these pages, so that a different shift scheme is defined for each page. For example, address bits [5:3] can be used to move the data in Page 1, address bits [6:4] in Page 2, bits 5, 4 and 2 of the address in Page 3, and so on.
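Per-page configuration can be modeled as a table from page number to the address bits that form shift_sel. The page size (1024 addressable units), the mapping of Page 1 to 3 onto page numbers 1 to 3, and the rule that an unconfigured page means "no shift" are all assumptions of this sketch; `PAGE_SHIFT_BITS` and `shift_step` are illustrative names.

```python
PAGE_BITS = 10  # assumption: 1024 addressable units per page

# Assumed register contents: page number -> address bits (MSB first).
PAGE_SHIFT_BITS = {
    1: (5, 4, 3),  # Page 1 uses address[5:3]
    2: (6, 5, 4),  # Page 2 uses address[6:4]
    3: (5, 4, 2),  # Page 3 uses bits 5, 4 and 2
}


def gather_bits(addr, positions):
    """Concatenate the address bits at `positions` (listed MSB first)."""
    value = 0
    for pos in positions:
        value = (value << 1) | ((addr >> pos) & 1)
    return value


def shift_step(addr):
    """Shift step for an address, chosen by the page it belongs to."""
    positions = PAGE_SHIFT_BITS.get(addr >> PAGE_BITS)
    if positions is None:
        return 0  # assumption: unconfigured pages are not shifted
    return gather_bits(addr, positions)


for page, inner in ((1, 8), (2, 8), (2, 16), (3, 4)):
    addr = (page << PAGE_BITS) | inner
    print(f"page {page}, in-page address {inner}: shift {shift_step(addr)}")
```

The same in-page address 8 is shifted by one bank in Page 1 but not at all in Page 2, because the two pages read different bits.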
Where possible, each page can be given its own set of registers. In some cases, however, the number of register sets can be smaller than the number of address pages; that is, the register sets can be configured to map to different pages in the manner of a TLB, which saves registers. This scheme is in effect a special TLB (Translation Lookaside Buffer). Traditionally, a TLB is used to translate virtual addresses into physical addresses, but here a TLB can be provided to apply different shift schemes in different pages when moving data in the memory from one bank to another.
In a memory space divided into multiple pages, the size of each page can be a power of 2, that is, M = 2^m, so an address within a page can be represented by m bits. Assuming the whole address has k bits, the remaining k-m bits form the page address. Multiple sets of configuration registers can be defined and used as the TLB. When the memory is accessed, the page address of the accessed address is compared with the page addresses in all the configuration register sets. If a match is found, that set's configuration is used to select shift_sel. If no match is found, the hardware either raises an interrupt to software, asking it to fill or replace the TLB entry, or automatically replaces the TLB entry using values preconfigured by software in a second TLB. Software must ensure that no two register sets are configured for the same page. In this way, a large page address space can be configured with a small number of register sets, and addresses can be moved from one bank to another using different shift schemes in different page address spaces.
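The TLB-style lookup can be modeled with a small class. The callback `on_miss` stands in for the software interrupt handler that supplies the configuration for a page, and the first-in-first-out replacement policy is an assumption, since the text does not specify which entry is replaced. Keying the entries by page number guarantees that no two entries describe the same page. All names are hypothetical.

```python
class ShiftConfigTLB:
    """A few register sets mapping page addresses to shift-bit selections."""

    def __init__(self, capacity, page_bits, on_miss):
        self.capacity = capacity    # number of configuration register sets
        self.page_bits = page_bits  # m, with page size M = 2**m
        self.on_miss = on_miss      # models the interrupt to software
        self.entries = {}           # page address -> selected bit positions

    def lookup(self, addr):
        page = addr >> self.page_bits      # upper k-m bits of the address
        if page not in self.entries:
            positions = self.on_miss(page)
            if len(self.entries) >= self.capacity:
                # Assumption: replace the oldest entry (FIFO order).
                oldest = next(iter(self.entries))
                del self.entries[oldest]
            self.entries[page] = positions
        return self.entries[page]


misses = []


def software_refill(page):
    """Hypothetical handler: odd pages use [5:3], even pages use [6:4]."""
    misses.append(page)
    return (5, 4, 3) if page % 2 else (6, 5, 4)


tlb = ShiftConfigTLB(capacity=2, page_bits=10, on_miss=software_refill)
for page in (1, 2, 1, 3, 1):
    tlb.lookup(page << 10)
print("pages that missed:", misses)
```

With two register sets, the second access to page 1 hits, while the third misses again because page 1 was replaced when page 3 was loaded.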
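The hardware implementation above is only an example illustrating the idea of the present invention. Its specific configuration does not limit the present invention, which can be implemented in various suitable ways.
According to another embodiment of the present invention, a system is also provided, which can be implemented in a processor as shown in FIG. 2. That is, the processor may include a memory 20 having multiple banks and a device 10 for reducing bank conflicts. The device 10 includes a bank conflict determination unit 101, configured to determine the binary addresses of the data that would cause a bank conflict when a memory comprising N banks is accessed in parallel; a shift step determination unit 103, configured to determine a shift step such that, after the data in the memory is moved by the shift step, the conflicting data no longer resides in the same bank, where the shift step indicates the number of banks by which the data is moved and is the value represented by n bits selected from the binary address of the data, with N = 2^n; and a shift unit 105, configured to move the data in the memory according to the shift step while the row address of each data item within its bank remains unchanged. The device 10 can carry out the method for reducing bank conflicts shown in FIG. 1.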
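One way to picture the shift step determination is a brute-force search over candidate bit fields. This search is not described in the original text; it is a hypothetical strategy, shown only to make the goal of the unit concrete. It assumes consecutive n-bit fields, wrap-around modulo N, and one access per bank per cycle, and it picks the lowest field that minimizes the number of cycles.

```python
from collections import Counter


def bank_after_shift(addr, shift_low, num_banks):
    """Bank after shifting by the n-bit field starting at bit `shift_low`."""
    mask = num_banks - 1
    return ((addr & mask) + ((addr >> shift_low) & mask)) % num_banks


def cycles(addresses, shift_low, num_banks):
    """Cycles needed if each bank serves one access per cycle."""
    banks = (bank_after_shift(a, shift_low, num_banks) for a in addresses)
    return max(Counter(banks).values())


def choose_shift_low(addresses, num_banks=8, max_low=12):
    """Hypothetical search: lowest field start that minimizes the cycles."""
    n = num_banks.bit_length() - 1
    return min(range(n, max_low + 1),
               key=lambda low: cycles(addresses, low, num_banks))


for accesses in ([0, 8, 16, 24], [0, 64, 128, 192]):
    low = choose_shift_low(accesses)
    print(accesses, "-> field starts at bit", low,
          "cycles:", cycles(accesses, low, 8))
```

A real design would more likely use the configured fields described earlier than an exhaustive search, which is used here only because it is easy to verify.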
Note in particular that the units mentioned in the embodiments of the present invention are logical units. Physically, a logical unit may be one physical unit, part of a physical unit, or a combination of several physical units. The physical implementation of these logical units is not what matters most; the combination of the functions they implement is the key to solving the technical problem addressed by the present invention. In addition, to highlight the innovative part of the present invention, the embodiments do not introduce units that are only loosely related to solving that technical problem. This does not mean that no other units exist in the embodiments.
According to another embodiment of the present invention, an electronic device including the processor described above is also provided. The electronic device may be any of various computing devices, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframes and other suitable computers; or any of various mobile devices, such as cellular phones, smartphones, portable digital assistants (PDAs), portable game consoles, handheld computers or tablet computers; or any of various smart devices, such as wearable smart devices and smart home appliances.
The embodiments of the present invention have been described in detail above with reference to the drawings, but the use of the technical solution of the present invention is not limited to the applications mentioned in the embodiments of this patent. Various structures and variations can readily be implemented with reference to the technical solution of the present invention to achieve the beneficial effects described here. Any change made within the knowledge of a person of ordinary skill in the art, without departing from the gist of the present invention, falls within the scope of this patent.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811580102.X | 2018-12-24 | ||
CN201811580102.XA CN109710309B (en) | 2018-12-24 | 2018-12-24 | Method for reducing memory bank conflict |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020135209A1 true WO2020135209A1 (en) | 2020-07-02 |
Family
ID=66256110
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/126552 WO2020135209A1 (en) | 2018-12-24 | 2019-12-19 | Method for reducing bank conflicts |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109710309B (en) |
WO (1) | WO2020135209A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109710309B (en) * | 2018-12-24 | 2021-01-26 | 安谋科技(中国)有限公司 | Method for reducing memory bank conflict |
CN111857831B (en) * | 2020-06-11 | 2021-07-20 | 成都海光微电子技术有限公司 | Memory bank conflict optimization method, parallel processor and electronic equipment |
CN114827091B (en) * | 2022-04-25 | 2023-06-20 | 珠海格力电器股份有限公司 | Physical address conflict processing method and device and communication equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101341473A (en) * | 2005-12-20 | 2009-01-07 | Nxp股份有限公司 | Multi-processor circuit with shared memory banks |
CN106133709A (en) * | 2014-02-27 | 2016-11-16 | 三星电子株式会社 | For the method and apparatus preventing the bank conflict in memorizer |
US10061541B1 (en) * | 2017-08-14 | 2018-08-28 | Micron Technology, Inc. | Systems and methods for refreshing a memory bank while accessing another memory bank using a shared address path |
CN109710309A (en) * | 2018-12-24 | 2019-05-03 | 安谋科技(中国)有限公司 | Methods to reduce bank conflicts |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7797493B2 (en) * | 2005-02-15 | 2010-09-14 | Koninklijke Philips Electronics N.V. | Enhancing performance of a memory unit of a data processing device by separating reading and fetching functionalities |
CN101082906A (en) * | 2006-05-31 | 2007-12-05 | 中国科学院微电子研究所 | Fixed base FFT processor with low memory overhead and method thereof |
CN102184092A (en) * | 2011-05-04 | 2011-09-14 | 西安电子科技大学 | Special instruction set processor based on pipeline structure |
CN105701036B (en) * | 2016-01-19 | 2019-03-05 | 中国人民解放军国防科学技术大学 | A kind of address conversioning unit for supporting the deformation parallel memory access of base 16FFT algorithm |
US10198369B2 (en) * | 2017-03-24 | 2019-02-05 | Advanced Micro Devices, Inc. | Dynamic memory remapping to reduce row-buffer conflicts |
CN107748723B (en) * | 2017-09-28 | 2020-03-20 | 中国人民解放军国防科技大学 | Storage method and access device supporting conflict-free stepping block-by-block access |
-
2018
- 2018-12-24 CN CN201811580102.XA patent/CN109710309B/en active Active
-
2019
- 2019-12-19 WO PCT/CN2019/126552 patent/WO2020135209A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN109710309B (en) | 2021-01-26 |
CN109710309A (en) | 2019-05-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19904522 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 031121) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19904522 Country of ref document: EP Kind code of ref document: A1 |