CN108804508A - Method and system for storing input image - Google Patents
- Publication number
- CN108804508A; CN201810344898.2A
- Authority
- CN
- China
- Prior art keywords
- memory
- size
- bytes
- storing
- input image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/42—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
- H04N19/423—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The present invention provides a method and system for storing an input image. One or more frame buffers are allocated in a memory. The input image is partitioned into a plurality of access units corresponding to a plurality of subsets of the input image, and a main portion and a secondary portion are allocated in the frame buffer to each access unit, wherein at least one secondary portion is not located sequentially after its respective main portion in the frame buffer. The access units are compressed into compressed access units, each compressed access unit is stored in its respective main portion, and, if the size of the compressed access unit exceeds the size of the main portion, the remainder of the compressed access unit is stored in the respective secondary portion. The invention enables compressed access units stored in the memory to be accessed efficiently.
Description
Priority Statement
This application claims priority to U.S. Provisional Patent Application No. 62/489,588, filed on April 25, 2017 and entitled "Memory Access Efficiency Optimization for Frame Buffer Compression," and to U.S. Provisional Patent Application No. 15/786,908, filed on October 18, 2017 and entitled "Distributed Access Unit for Frame Buffer Compression," both of which are incorporated herein by reference in their entirety.
Technical Field
The disclosed embodiments of the invention relate to storage techniques and, more particularly, to a method and system for storing an input image.
Background
The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description that were not otherwise regarded as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present invention.
An electronic device, such as a computer system, may include one or more memories. In one example, the electronic device includes a component, such as a central processing unit (CPU), that is located on a different integrated circuit chip from the memory and accesses the memory through a memory controller. Memory accesses by the CPU can generate heavy data traffic between the CPU and the memory.
Summary of the Invention
In view of this, the present invention provides a method and system for storing an input image so that compressed access units stored in a memory can be accessed efficiently.
Aspects of the invention provide a method of storing an input image in a memory. The method may include: allocating one or more frame buffers in the memory; partitioning the input image into a plurality of access units corresponding to a plurality of subsets of the input image, and allocating, in the frame buffer, a main portion and a secondary portion to each of the access units, wherein at least one secondary portion is not located sequentially after its respective main portion in the frame buffer; compressing the access units into a plurality of compressed access units; and storing each compressed access unit in its respective main portion and, if the size of the compressed access unit exceeds the size of the main portion, storing the remainder of the compressed access unit in the respective secondary portion.
Aspects of the invention also provide a system for storing an input image. The system includes a memory, a memory allocation device, and a memory controller. The memory has one or more frame buffers. The memory allocation device is configured to receive the input image, allocate a frame buffer in the memory to store the input image, partition the input image into a plurality of access units corresponding to a plurality of subsets of the input image, and allocate, in the frame buffer, a main portion and a secondary portion to each access unit, wherein at least one secondary portion is not located sequentially after its respective main portion in the frame buffer. The memory controller is configured, in response to instructions from the memory allocation device, to store each compressed access unit in its respective main portion and, if the size of the compressed access unit exceeds the size of the main portion, to store the remainder of the compressed access unit in the respective secondary portion.
Optional aspects of the invention may provide a non-transitory computer-readable medium storing computer-readable instructions that, when executed by processing circuitry, cause the processing circuitry to perform a method comprising: allocating one or more frame buffers in a memory; partitioning an input image into a plurality of access units corresponding to a plurality of subsets of the input image, and allocating, in the frame buffer, a main portion and a secondary portion to each of the access units, wherein at least one secondary portion is not located sequentially after its respective main portion in the frame buffer; compressing the access units into a plurality of compressed access units; and storing each compressed access unit in its respective main portion and, if the size of the compressed access unit exceeds the size of the main portion, storing the remainder of the compressed access unit in the respective secondary portion.
The invention allocates a main portion and a secondary portion in the memory to each access unit. After an access unit is compressed, the compressed access unit is stored in the main portion; when the main portion is smaller than the compressed access unit, the remainder of the compressed access unit is stored in the secondary portion. As a result, the compressed access units can be accessed efficiently.
Brief Description of the Drawings
Various embodiments of the invention, provided as examples, are described in detail with reference to the following drawings, in which like reference numerals denote like elements, and in which:
FIG. 1 is an exemplary block diagram of a memory system according to an embodiment of the invention;
FIG. 2 shows an exemplary data structure according to an embodiment of the invention;
FIG. 3 shows three exemplary superblocks in three frame buffers according to an embodiment of the invention;
FIG. 4 shows three exemplary superblocks in three frame buffers according to an embodiment of the invention;
FIG. 5 shows two exemplary superblocks in two frame buffers according to an embodiment of the invention;
FIG. 6 shows an alternative frame buffer example according to an embodiment of the invention;
FIG. 7 is a flowchart describing an exemplary process according to an embodiment of the invention.
Detailed Description
FIG. 1 shows an exemplary block diagram of a memory system 100 according to an embodiment of the invention. As shown, the memory system 100 may include a memory allocation device 110, a memory controller 120, and a memory 130. The memory 130 may include a frame buffer 131. The memory system 100 is configured to partition an input image into one or more access units and to store each compressed access unit in the main portion and the secondary portion that are allocated in the frame buffer 131 for the respective access unit.
The memory system 100 may be any suitable system for storing data. In one embodiment, the memory system 100 is an electronic device, such as a desktop computer, a tablet computer, a smartphone, a wearable device, a smart TV, a camera, a camcorder, or a media player. In an example embodiment, the memory system 100 may also include other components that access the data stored in the memory 130. For example, the other components may include a CPU 141, a graphics processing unit (GPU) 142, a multimedia engine 143, a display circuit 144, an image processor 145, a video codec 146, and the like.
In one embodiment, the memory 130 may have a sequence of memory blocks separated by memory boundaries that are determined by page size or channel partitioning, for example at every 32 bytes, 64 bytes, 128 bytes, 256 bytes, 512 bytes, 1K bytes, 2K bytes, or 4K bytes. Accessing a given amount of data stored within one memory block, between two adjacent boundaries, is more efficient than accessing the same amount of data stored across two memory blocks that straddle a memory boundary. Data in the memory 130 can therefore be accessed efficiently by another component of the memory system 100 when the start address of the data is aligned with a memory boundary. Memory boundaries lie at addresses that are multiples of the memory block size. In one embodiment, the memory block size corresponds to an amount of data that can be transferred quickly between the memory 130 and the other components of the memory system 100 as a sequence of burst read/write commands with a single or a few precharge and activate commands. The memory block size may be selected based on characteristics of the memory 130 and of the other components of the memory system 100 that access it, such as the page size and channel partitioning of the memory 130, and the architecture and operating modes of the memory 130 and of those components.
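The effect of alignment described above can be illustrated with a small sketch. The helper names `is_aligned` and `blocks_touched` are hypothetical and are not taken from the patent; the sketch only counts how many memory blocks a byte range overlaps, under the assumption that a range confined to a single block is the cheap case.

```python
def is_aligned(address: int, block_size: int) -> bool:
    """True when the address lies on a memory boundary (a multiple of block_size)."""
    return address % block_size == 0


def blocks_touched(start: int, length: int, block_size: int) -> int:
    """Number of memory blocks overlapped by the byte range [start, start + length)."""
    if length <= 0:
        return 0
    first_block = start // block_size
    last_block = (start + length - 1) // block_size
    return last_block - first_block + 1


# Illustrative values: 64 bytes of data with a 64-byte memory block size.
aligned_cost = blocks_touched(128, 64, 64)    # start on a boundary
straddle_cost = blocks_touched(160, 64, 64)   # start in the middle of a block
print(aligned_cost, straddle_cost)
```

With these illustrative values the aligned range stays inside one memory block, while the unaligned range of the same length straddles a boundary and overlaps two.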
The memory allocation device 110 is configured to receive an input image and partition it into one or more access units. It is also configured to allocate a portion of the memory 130, such as the frame buffer 131, to the input image, and to allocate two memory portions in the frame buffer 131 to each access unit: a main portion and a secondary portion. In one example, the start address of the frame buffer 131 may be aligned with a memory boundary, e.g., at byte 0. The memory allocation device 110 is further configured to compress each access unit and to store each compressed access unit in its respective main portion and, if the size of the compressed access unit exceeds the size of the main portion, to store the remainder of the compressed access unit in its respective secondary portion. In one embodiment, the memory allocation device 110 may be integrated in any component that accesses data stored in the memory 130, for example one or more components of the memory system 100, including the CPU 141, the GPU 142, the multimedia engine 143, the display circuit 144, the image processor 145, the video codec 146, and the like.
In one embodiment, a main portion may have a start address aligned with a memory boundary and a size that is one or more times the memory block size, so that data stored in the main portion can be accessed efficiently. Alternatively, when the main portion is smaller than the memory block size, each main portion may lie within a single memory block, and the start addresses of one or more main portions may be aligned with one or more memory boundaries.
The size of a secondary portion may be a fraction of the memory block size. Two or more secondary portions can therefore be grouped together and stored separately from their respective main portions. When the size of a compressed access unit is less than or equal to the size of the main portion, the compressed access unit can be stored entirely within the main portion without using the secondary portion. Because the main portion can be accessed efficiently, the compressed access unit can then be accessed efficiently as well.
In another embodiment, at least one of the secondary portions is not located contiguously after its respective main portion in the frame buffer. This includes a reverse order in which the main portion has a higher address than its respective secondary portion.
The memory controller 120 is configured to manage memory accesses from the memory allocation device 110 to the memory 130. The memory controller 120 may receive requests from the memory allocation device 110 to store compressed access units in their respective main and secondary portions in the frame buffer 131 of the memory 130. Based on these requests, the memory controller 120 may send commands to the memory 130 to store the compressed access units in their respective main and secondary portions in the frame buffer 131. The memory controller 120 may also schedule and buffer these requests.
The memory 130 may be any suitable device for storing data. In one embodiment, the memory 130 includes dynamic random access memory (DRAM) modules, for example double data rate synchronous DRAM (DDR SDRAM), double data rate two synchronous DRAM (DDR2 SDRAM), double data rate three synchronous DRAM (DDR3 SDRAM), double data rate four synchronous DRAM (DDR4 SDRAM), low-power DDR SDRAM (LPDDR SDRAM), and the like.
In one embodiment, the memory system 100 may be a system-on-chip (SOC) in which all components are located on a single integrated circuit (IC) chip. Other components, such as the CPU 141, the GPU 142, the multimedia engine 143, the display circuit 144, the image processor 145, and the video codec 146, may also be included on the same IC chip. Alternatively, the components of the memory system 100 may be distributed across several ICs. For example, the memory allocation device 110, the memory controller 120, the memory 130, and the other components of the memory system 100 may be located on multiple IC chips. In addition, the memory allocation device 110 may be integrated in any component that accesses data stored in the memory 130, for example one or more components of the memory system 100, including the CPU 141, the GPU 142, the multimedia engine 143, the display circuit 144, the image processor 145, the video codec 146, and the like.
During operation, an input image may be received by the memory allocation device 110, which may partition the input image into one or more access units. The memory allocation device 110 may also allocate a portion of the memory 130, such as the frame buffer 131, to the input image. Two memory portions, a main portion and a secondary portion, are allocated to each access unit in the frame buffer 131. Under instruction from the memory allocation device 110, the memory controller 120 may store each compressed access unit in its respective main portion and, depending on its size, in its secondary portion. A main portion may have a start address aligned with a memory boundary and a size that is one or more times the memory block size. The size of a secondary portion may be a fraction of the memory block size, so two or more secondary portions can be grouped together and stored separately from their respective main portions. When the size of a compressed access unit is less than or equal to the size of the main portion, the compressed access unit can be stored entirely within the main portion without using the secondary portion. In this way, the compressed access unit can be accessed efficiently.
FIG. 2 shows an exemplary data structure 200 according to an embodiment of the invention, including an input image 210 partitioned into access units, a frame buffer 231A, and a frame buffer 231B. As shown, the input image 210 may be partitioned into an N×M array of access units. Within the array, the size of an access unit depends on the number of pixels in the access unit and on the pixel bit depth. The pixel bit depth is the number of bits used to specify the color of a pixel, for example 10 bits or 12 bits, corresponding to 1024 or 4096 colors, respectively. In one example, the number of pixels in an access unit may depend on the compression method used by the memory allocation device 110; for instance, the size of a compression unit depends on the compression method in use. The size of a compression unit may be, for example, 4×4 pixels, 8×8 pixels, 16×4 pixels, 16×8 pixels, or 16×16 pixels. An access unit may contain one or more compression units.
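As a rough worked example of how the uncompressed size of an access unit follows from the pixel count and the pixel bit depth, the sketch below multiplies the two and rounds up to whole bytes. The function name `access_unit_bytes` is hypothetical, and the sketch assumes one tightly packed sample per pixel, which the text does not state.

```python
def access_unit_bytes(width_px: int, height_px: int, bit_depth: int,
                      compression_units: int = 1) -> int:
    """Uncompressed size in bytes of an access unit made of `compression_units`
    compression units of width_px x height_px pixels each.

    Assumption: one sample of `bit_depth` bits per pixel, packed without padding.
    """
    total_bits = width_px * height_px * bit_depth * compression_units
    return (total_bits + 7) // 8  # round up to whole bytes


# One 16x8 compression unit at 10 bits per pixel: 128 pixels * 10 bits = 1280 bits.
print(access_unit_bytes(16, 8, 10))
# One 16x8 compression unit at 12 bits per pixel: 128 pixels * 12 bits = 1536 bits.
print(access_unit_bytes(16, 8, 12))
```

Under this assumption, a single 16×8 compression unit at 10 bits per pixel occupies 160 bytes, and the same unit at 12 bits per pixel occupies 192 bytes.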
The frame buffer 231A shows an exemplary frame buffer structure for storing an input image. A frame buffer may be a memory with accessible locations for storing data. The accessible locations within the memory may be grouped into memory blocks having a memory block size. As described above, the memory block size may be selected based on characteristics of the memory 130 and of the other components of the memory system 100 that access it, such as the page size and channel partitioning of the memory 130, and the architecture and operating modes of the memory 130 and of those components. In one example, the memory block size may be selected as 32 bytes, 64 bytes, 128 bytes, 256 bytes, 512 bytes, 1K bytes, 2K bytes, 4K bytes, and so on. For example, the memory 130 may be a DDR3 SDRAM device from which data is retrieved, and the memory block size may be the amount of data that can be retrieved from the memory 130 in a single read cycle. Specifically, when the data bus width is 64 bits (i.e., 8 bytes) and the burst length is 4, the memory block size may be 32 bytes. In another example, the memory block size may be determined by the cache line of the CPU 141 or the GPU 142 that accesses the memory, for example 64 bytes or 128 bytes. In the example of FIG. 2, the memory block size is 64 bytes, so in the frame buffers 231A and 231B the memory boundaries 250(1)-250(n) are located at 0 bytes, 64 bytes, 128 bytes, 192 bytes, and so on.
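The arithmetic behind the two numeric examples in this paragraph can be sketched as follows. The function names `burst_block_size` and `memory_boundaries` are hypothetical helpers introduced only for illustration.

```python
def burst_block_size(bus_width_bits: int, burst_length: int) -> int:
    """Bytes transferred by one burst: bus width in bytes times the burst length."""
    return (bus_width_bits // 8) * burst_length


def memory_boundaries(block_size: int, count: int) -> list:
    """Addresses of the first `count` memory boundaries (multiples of block_size)."""
    return [i * block_size for i in range(count)]


# 64-bit data bus with a burst length of 4, as in the DDR3 example in the text.
print(burst_block_size(64, 4))
# 64-byte memory blocks, as in the FIG. 2 example.
print(memory_boundaries(64, 4))
```

The first call yields the 32-byte block size from the DDR3 example, and the second yields the boundary addresses 0, 64, 128, and 192 from the FIG. 2 example.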
A main portion and a secondary portion may be allocated to each access unit. In one embodiment, the size of the main portion may depend on the compressibility of the input image, the compression method, the memory block size, and so on. Furthermore, in one embodiment, the sum of the size of the main portion and the size of the secondary portion may equal the size of the access unit. Likewise, the ratio of the size of the main portion to the size of the secondary portion may depend on the compressibility of the input image, the compression method, the memory block size, and so on. For example, when an access unit can be compressed to a smaller size, a smaller main portion suffices to store the compressed access unit and the respective secondary portion may remain empty, so the ratio of the size of the main portion to the size of the secondary portion can be smaller. The ratio may be, for example, 2, 4, or 8.
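For the embodiment in which the main and secondary portions together equal the access unit size, the two sizes follow directly from the access unit size and the chosen ratio. The sketch below is a minimal illustration; `portion_sizes` is a hypothetical name, and rejecting sizes that do not divide evenly is a simplifying assumption rather than something the text requires.

```python
def portion_sizes(access_unit_size: int, ratio: int) -> tuple:
    """Split an access unit into (main_size, secondary_size) with
    main_size == ratio * secondary_size and main_size + secondary_size
    == access_unit_size.

    Assumption: the access unit size must divide evenly by (ratio + 1).
    """
    if access_unit_size % (ratio + 1) != 0:
        raise ValueError("access unit size is not divisible by ratio + 1")
    secondary_size = access_unit_size // (ratio + 1)
    return ratio * secondary_size, secondary_size


print(portion_sizes(192, 2))  # ratio 2
print(portion_sizes(160, 4))  # ratio 4
```

A 192-byte access unit with ratio 2 splits into 128 and 64 bytes, and a 160-byte access unit with ratio 4 splits into 128 and 32 bytes, which are the combinations used in the superblock examples of FIG. 3.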
In addition, a main portion may have a start address aligned with a memory boundary and a size that is a multiple of the memory block size, so that data stored in the main portion can be accessed efficiently. In the example of FIG. 2, the size of each of the main portions 221(1)-221(3) may be selected to equal the memory block size of 64 bytes, and the start addresses of the main portions 221(1)-221(3) are aligned with the memory boundaries 250(1)-250(3), respectively. The size of each of the respective secondary portions 241(1)-241(3) may be selected to be less than 64 bytes, for example 32 bytes.
The frame buffer 231B shows an example in which compressed access units 261-263 of various sizes are stored. The compressed access units 261-263 may be stored in their respective main portions 221(1)-221(3) and secondary portions 241(1)-241(3). In the example of FIG. 2, the size of the compressed access unit 261 is less than the memory block size of 64 bytes, so the compressed access unit 261 fits entirely within the main portion 221(1) and the secondary portion 241(1) remains empty. The size of the compressed access unit 262 equals the memory block size of 64 bytes, so the compressed access unit 262 fits entirely within the main portion 221(2) and the secondary portion 241(2) remains empty. The size of the compressed access unit 263, however, is greater than the memory block size of 64 bytes. A first part of the compressed access unit 263 therefore fills the main portion 221(3), and the second part, or remainder, of the compressed access unit 263 is stored in the secondary portion 241(3).
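The three cases described for FIG. 2 can be sketched as a simple split of the compressed bytes. The function name `split_compressed_unit` is hypothetical, and the concrete sizes of 48 and 80 bytes are illustrative assumptions, since the text only says "less than" and "greater than" 64 bytes.

```python
def split_compressed_unit(compressed: bytes, main_size: int, secondary_size: int):
    """Return (main_data, secondary_data) for one compressed access unit.

    The main portion is filled first; any remainder goes to the secondary
    portion, which stays empty when the unit fits in the main portion.
    """
    if len(compressed) > main_size + secondary_size:
        raise ValueError("compressed access unit exceeds main + secondary capacity")
    return compressed[:main_size], compressed[main_size:]


MAIN_SIZE, SECONDARY_SIZE = 64, 32  # sizes from the FIG. 2 example

for size in (48, 64, 80):  # smaller than, equal to, and larger than the main portion
    main_data, secondary_data = split_compressed_unit(bytes(size), MAIN_SIZE, SECONDARY_SIZE)
    print(size, len(main_data), len(secondary_data))
```

Only the 80-byte unit spills into the secondary portion (16 bytes); the other two are stored entirely in the main portion, so reading them back touches a single aligned memory block.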
In various embodiments, the size of the access unit, the size of the main portion, and the size of the secondary portion may be selected and kept constant for an input image. On the other hand, multiple input images, such as successive frames of a video, may be stored by the memory system 100. The sizes of the access unit, the main portion, and the secondary portion may be selected for each individual input image and can therefore vary dynamically from one input image to the next.
The main portions and secondary portions may be arranged in the frame buffer 131 according to various layouts. FIGS. 3-5 show example layouts consisting of a periodically repeating pattern of main portions and secondary portions, in which the smallest repeating unit is a superblock. The main portions and secondary portions in the frame buffer 131 can thus be arranged, for example, by placing superblocks sequentially next to one another. In one embodiment, the size of a superblock may be a multiple of the memory block size.
FIG. 3 shows three exemplary superblocks 341A-341C in three frame buffers 331A-331C according to an embodiment of the invention. The superblocks 341A-341C share the same characteristic: all secondary portions in a superblock are combined into a secondary portion group located in the middle of the respective superblock. The start addresses of the main portions are aligned with memory boundaries. The size of the secondary portion group is one or more times the memory block size, and the first secondary portion of the group is aligned with a memory boundary.
Referring to the superblock 341A, the access unit size is set to 160 bytes, the memory block size is set to 128 bytes, and the sizes of the main portion and the secondary portion are set to 128 bytes and 32 bytes, respectively. As shown, the superblock pattern has a secondary portion group of four secondary portions (S0-S3) inserted between a first main portion group (M0-M1) and a second main portion group (M2-M3). The size of the superblock 341A is 5 times the memory block size (i.e., 640 bytes). The memory boundaries of the superblock 341A are located at 0 bytes, 128 bytes, 256 bytes, 384 bytes, 512 bytes, and 640 bytes, and the start addresses of the main portions M0-M3 are aligned with the memory boundaries at 0 bytes, 128 bytes, 384 bytes, and 512 bytes, respectively. The secondary portion group has a size of 128 bytes, and its first secondary portion S0 is aligned with the memory boundary at 256 bytes.
Referring to the superblock 341B, the access unit size is set to 192 bytes, the memory block size is set to 128 bytes, and the sizes of the main portion and the secondary portion are set to 128 bytes and 64 bytes, respectively. As shown, the superblock pattern has a secondary portion group of four secondary portions (S0-S3) inserted between a first main portion group (M0-M1) and a second main portion group (M2-M3). The size of the superblock 341B is 6 times the memory block size (i.e., 768 bytes). The memory boundaries of the superblock 341B are located at 0 bytes, 128 bytes, 256 bytes, 384 bytes, 512 bytes, 640 bytes, and 768 bytes, and the start addresses of the main portions M0-M3 are aligned with the memory boundaries at 0 bytes, 128 bytes, 512 bytes, and 640 bytes, respectively. The secondary portion group has a size of 256 bytes, and its first secondary portion S0 is aligned with the memory boundary at 256 bytes.
Referring to superblock 341C, the access unit size is set to 384 bytes, the memory block size is set to 256 bytes, and the main portion and secondary portion sizes are set to 256 bytes and 128 bytes, respectively. As shown, the superblock pattern has a secondary portion group comprising two secondary portions (i.e., S0-S1), which is inserted between a first main portion M0 and a second main portion M1. The size of superblock 341C is three times the memory block size (i.e., 768 bytes). The memory boundaries of superblock 341C are located at 0, 256, 512, and 768 bytes, and the start addresses of main portions M0-M1 are aligned with the memory boundaries at 0 bytes and 512 bytes, respectively. The secondary portion group has a size of 256 bytes, and the first secondary portion S0 is aligned with the memory boundary at 256 bytes.
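The three FIG. 3 layouts can be reproduced with a short offset calculation. The following is a minimal sketch, not the patented implementation: the function name `interleaved_superblock` is hypothetical, and it assumes the secondary portions appear in index order inside their group and that the main portions are split evenly around it.

```python
def interleaved_superblock(main_size, sub_size, block_size, n_units):
    """Return ({name: byte offset}, superblock size) for a FIG. 3 style layout:
    first half of the main portions, then all secondary portions, then the
    second half of the main portions."""
    assert n_units % 2 == 0, "sketch assumes an even number of access units"
    half = n_units // 2
    layout = {}
    offset = 0
    for i in range(half):               # first main portion group
        layout[f"M{i}"] = offset
        offset += main_size
    for i in range(n_units):            # secondary portion group (order assumed)
        layout[f"S{i}"] = offset
        offset += sub_size
    for i in range(half, n_units):      # second main portion group
        layout[f"M{i}"] = offset
        offset += main_size
    assert offset % block_size == 0, "superblock must span whole memory blocks"
    return layout, offset


layout_a, size_a = interleaved_superblock(128, 32, 128, 4)    # superblock 341A
layout_b, size_b = interleaved_superblock(128, 64, 128, 4)    # superblock 341B
layout_c, size_c = interleaved_superblock(256, 128, 256, 2)   # superblock 341C
```

With these parameters every main portion offset and the offset of S0 are multiples of the memory block size, which is the alignment property the text describes.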
FIG. 4 shows three exemplary superblocks 441A-441C in three frame buffers 431A-431C according to an embodiment of the present invention. Superblocks 441A-441C share the same characteristics: all secondary portions in a superblock are combined into a secondary portion group, which follows a main portion group containing the main portions. The start addresses of the main portions are aligned with memory boundaries. The size of the secondary portion group is one or more times the memory block size, and the first secondary portion of the group is aligned with a memory boundary.
Referring to superblock 441A, the access unit size is set to 192 bytes, the memory block size is set to 128 bytes, and the main portion and secondary portion sizes are set to 128 bytes and 64 bytes, respectively. As shown, superblock pattern 441A has a secondary portion group comprising two secondary portions (i.e., S0-S1), which follows the main portion group (i.e., M0-M1). The size of superblock 441A is three times the memory block size (i.e., 384 bytes). The memory boundaries of superblock 441A are located at 0, 128, 256, and 384 bytes, and the start addresses of main portions M0-M1 are aligned with the memory boundaries at 0 bytes and 128 bytes, respectively. The secondary portion group has a size of 128 bytes, and the first secondary portion S0 is aligned with the memory boundary at 256 bytes.
Referring to superblock 441B, the access unit size is set to 160 bytes, the memory block size is set to 128 bytes, and the main portion and secondary portion sizes are set to 128 bytes and 32 bytes, respectively. As shown, the superblock pattern has a secondary portion group comprising four secondary portions (i.e., S0-S3), which follows the main portion group (i.e., M0-M3). The size of superblock 441B is five times the memory block size (i.e., 640 bytes). The memory boundaries of superblock 441B are located at 0, 128, 256, 384, 512, and 640 bytes, and the start addresses of main portions M0-M3 are aligned with the memory boundaries at 0, 128, 256, and 384 bytes, respectively. The secondary portion group has a size of 128 bytes (four 32-byte secondary portions), and the first secondary portion S0 is aligned with the memory boundary at 512 bytes.
Referring to superblock 441C, the access unit size is set to 320 bytes, the memory block size is set to 128 bytes, and the main portion and secondary portion sizes are set to 256 bytes and 64 bytes, respectively. As shown, the superblock pattern has a secondary portion group comprising two secondary portions (i.e., S0-S1), which follows the main portion group (i.e., M0-M1). The size of superblock 441C is five times the memory block size (i.e., 640 bytes). The memory boundaries of superblock 441C are located at 0, 128, 256, 384, 512, and 640 bytes, and the start addresses of main portions M0-M1 are aligned with the memory boundaries at 0 bytes and 256 bytes, respectively. The secondary portion group has a size of 128 bytes (two 64-byte secondary portions), and the first secondary portion S0 is aligned with the memory boundary at 512 bytes.
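The FIG. 4 arrangement, in which the main portion group comes first and the secondary portion group follows it, reduces to simple arithmetic. The sketch below is illustrative only; `grouped_superblock` is a hypothetical name, and the in-order placement of the secondary portions is an assumption.

```python
def grouped_superblock(main_size, sub_size, n_units):
    """Return (main offsets, secondary offsets, superblock size) for a layout
    where all main portions come first, followed by all secondary portions."""
    mains = [i * main_size for i in range(n_units)]
    sub_base = n_units * main_size                 # secondary group starts here
    subs = [sub_base + i * sub_size for i in range(n_units)]
    return mains, subs, sub_base + n_units * sub_size


mains_441a, subs_441a, size_441a = grouped_superblock(128, 64, 2)
mains_441b, subs_441b, size_441b = grouped_superblock(128, 32, 4)
mains_441c, subs_441c, size_441c = grouped_superblock(256, 64, 2)
```

In each case the secondary group occupies `n_units * sub_size` bytes, so the superblock size equals the number of access units times the access unit size.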
FIG. 5 shows two exemplary superblocks 541A-541B in two frame buffers 531A-531B according to an embodiment of the present invention. Superblocks 541A-541B share the same characteristic: the main portion size (i.e., 128 bytes) is smaller than the memory block size (i.e., 256 bytes). In addition, some compressed access units may need to be stored in both a main portion and a secondary portion, while others can be stored entirely in the main portion. To allow efficient access to compressed access units stored across main and secondary portions, as many secondary portions as possible are placed in the same memory block as their respective main portions, preferably immediately after them. For example, in superblock 541A, main portion M0 is immediately followed by its secondary portion S0, and main portion M3 is immediately followed by its secondary portion S3, within their respective memory blocks.
Referring to superblock 541A, the access unit size is set to 192 bytes, the memory block size is set to 256 bytes, and the main portion and secondary portion sizes are set to 128 bytes and 64 bytes, respectively. As shown, the superblock pattern has secondary portion S0 following its respective main portion M0, and secondary portion S3 following its respective main portion M3. The size of superblock 541A is three times the memory block size (i.e., 768 bytes). The memory boundaries of superblock 541A are located at 0, 256, 512, and 768 bytes, and the start addresses of the three main portions M0, M1, and M3 are aligned with the memory boundaries at 0, 256, and 512 bytes, respectively. Main portion M2 is not aligned with a memory boundary, but it lies inside the single memory block between 256 bytes and 512 bytes.
Referring to superblock 541B, the access unit size is set to 192 bytes, the memory block size is set to 256 bytes, and the main portion and secondary portion sizes are set to 128 bytes and 64 bytes, respectively. As shown, the superblock pattern has secondary portion S0 following its main portion M0, and secondary portion S1 following its main portion M1. The size of superblock 541B is three times the memory block size (i.e., 768 bytes). The memory boundaries of superblock 541B are located at 0, 256, 512, and 768 bytes, and the start addresses of the three main portions M0, M1, and M2 are aligned with the memory boundaries at 0, 256, and 512 bytes, respectively. Main portion M3 is not aligned with a memory boundary, but it lies inside the single memory block between 512 bytes and 768 bytes.
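The property that matters in FIG. 5 is that a main portion never straddles a memory boundary, even when it is not aligned. A minimal check is sketched below; the helper name is hypothetical, and the offset 384 for M2 is inferred here from the stated sizes (M1 occupies bytes 256 to 383 of the same block), not quoted from the figure.

```python
def fits_in_one_block(start, size, block_size):
    """True if the byte range [start, start + size) lies in one memory block."""
    return start // block_size == (start + size - 1) // block_size


BLOCK = 256   # memory block size of superblock 541A
MAIN = 128    # main portion size

# M0, M1, M3 offsets are from the text; the M2 offset is an inferred assumption.
mains_541a = {"M0": 0, "M1": 256, "M2": 384, "M3": 512}

aligned = {name: off % BLOCK == 0 for name, off in mains_541a.items()}
contained = {name: fits_in_one_block(off, MAIN, BLOCK)
             for name, off in mains_541a.items()}
```

Here `aligned["M2"]` is false while `contained["M2"]` is true, so each main portion can still be fetched with a single memory block access.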
FIG. 6 shows alternative frame buffer examples according to an embodiment of the present invention. The main portions and secondary portions of frame buffer 631A and frame buffer 631B can be arranged in two groups, namely a main portion group and a secondary portion group, where the main portion group includes all main portions placed sequentially next to each other, and the secondary portion group includes all secondary portions placed sequentially next to each other. In one embodiment, the main portion size is one or more times the memory block size, and the start addresses of the main portions may be aligned with memory boundaries. The main portion group may be placed adjacent to the secondary portion group, or may be separated from it.
As shown for frame buffer 631A, the access unit size is set to 80 bytes, the memory block size is set to 64 bytes, and the main portion and secondary portion sizes are set to 64 bytes and 16 bytes, respectively. The main portion group includes all main portions. As shown, the main portions are placed next to each other and have start addresses aligned with consecutive memory boundaries at 0, 64, 128, 192, 256 bytes, and so on. The secondary portion group includes all secondary portions. The first secondary portion S0 may have a start address aligned with a memory boundary, for example the memory boundary at 512 bytes.
As shown for frame buffer 631B, the access unit size is set to 160 bytes, the memory block size is set to 128 bytes, and the main portion and secondary portion sizes are set to 128 bytes and 32 bytes, respectively. The main portion group includes all main portions. As shown, the main portions are placed next to each other and have start addresses aligned with consecutive memory boundaries at 0, 128, 256, 384, 512 bytes, and so on. The secondary portion group includes all secondary portions. The first secondary portion S0 may have a start address aligned with a memory boundary, for example the memory boundary at 4096 bytes.
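For the FIG. 6 arrangement, the address of any portion follows directly from the access unit index. The sketch below is illustrative; `split_layout_addresses` and the parameter `sub_base` (the start address of the secondary portion group) are hypothetical names.

```python
def split_layout_addresses(index, main_size, sub_size, sub_base):
    """Return (main address, secondary address) of access unit `index` when all
    main portions are contiguous from address 0 and all secondary portions are
    contiguous from `sub_base`."""
    return index * main_size, sub_base + index * sub_size


# Frame buffer 631A: 64-byte main portions, 16-byte secondary portions, S0 at 512.
main_631a, sub_631a = split_layout_addresses(3, 64, 16, 512)

# Frame buffer 631B: 128-byte main portions, 32-byte secondary portions, S0 at 4096.
main_631b, sub_631b = split_layout_addresses(4, 128, 32, 4096)
```

Because no lookup table is needed, a reader of the frame buffer can locate both portions of an access unit from its index alone.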
In the superblocks and frame buffers shown in FIGS. 3-6, at least one secondary portion is not located sequentially after its respective main portion.
In one embodiment, the start addresses of the superblocks and frame buffers may be aligned with a memory boundary of memory 130, for example at 0 bytes as shown in FIGS. 3-6.
Although exemplary superblocks and frame buffers are shown in FIGS. 3-6, it should be understood that variations, such as variations of the superblock pattern or of the positions of superblocks in the frame buffer, are possible in order to accommodate different memory usage scenarios.
During operation, once the main portions and secondary portions have been placed in frame buffer 131 according to a layout, such as the layouts shown in FIGS. 3-6, each compressed access unit can be stored in its respective main portion and, if necessary, in its respective secondary portion. When the size of a compressed access unit is equal to or smaller than the size of its main portion, the compressed access unit can be stored entirely in the main portion, and the corresponding secondary portion can remain empty.
FIG. 7 shows a flowchart of an exemplary process 700 according to an embodiment of the present invention. In one example, process 700 is performed by memory system 100 in FIG. 1. The process starts at step S701 and proceeds to step S710.
In step S710, the input image is partitioned into one or more access units, for example the N×M array of access units shown in FIG. 2. In one example, memory allocation device 110 is configured to partition the input image into the array of access units. The input image may be a video frame, a photographic image, a graphics image, an animated image, or the like. For example, the video frame may be a reference frame used by video codec 146. The process then proceeds to step S720.
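Step S710 can be pictured as tiling the image. The sketch below rests on assumptions the text does not state: the image is a flat raster of one byte per pixel, each access unit is a rectangular tile, and the image dimensions divide evenly by the tile dimensions. The function name `partition` is hypothetical.

```python
def partition(image, width, height, tile_w, tile_h):
    """Split a raster-order byte image into a 2-D array of tile-shaped access
    units; units[r][c] holds the bytes of the tile in tile-row r, tile-column c."""
    assert len(image) == width * height
    assert width % tile_w == 0 and height % tile_h == 0
    units = []
    for ty in range(0, height, tile_h):
        row_units = []
        for tx in range(0, width, tile_w):
            # Gather tile_h row segments of tile_w bytes each.
            tile = b"".join(
                image[(ty + r) * width + tx:(ty + r) * width + tx + tile_w]
                for r in range(tile_h)
            )
            row_units.append(tile)
        units.append(row_units)
    return units


image = bytes(range(32))                 # 8 x 4 test image
units = partition(image, 8, 4, 4, 2)     # 2 x 2 array of 8-byte access units
```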
In step S720, a frame buffer is allocated in the memory. In one example, memory allocation device 110 is configured to allocate frame buffer 131 in memory 130. The size of the frame buffer is equal to or larger than the size of the input image. In one embodiment, the start address of the frame buffer may be aligned with a memory boundary, for example the memory boundary at 0 bytes.
In step S730, each access unit is allocated two memory portions in the frame buffer, namely a main portion and a secondary portion. In one example, the memory allocation device is configured to allocate a main portion and a secondary portion to each access unit in frame buffer 131. In one embodiment, the sum of the main portion size and the secondary portion size may be equal to the access unit size, for example the size of the uncompressed access unit. In one embodiment, the main portion size may depend on the compressibility of the input image, the compression method, the memory block size, and so on. Furthermore, in one embodiment, the ratio of the main portion size to the secondary portion size may depend on the compressibility of the input image, the compression method, and so on. For example, when the access units can be compressed to a smaller size, a smaller main portion is sufficient to store each compressed access unit and the respective secondary portion can remain empty, so the ratio of the main portion size to the secondary portion size can be smaller.
Further, in one embodiment, a main portion may have a start address aligned with a memory boundary and a size that is one or more times the memory block size, so that the data stored in the main portion can be accessed efficiently.
Alternatively, when the main portion size is smaller than the memory block size, each main portion may be located within a respective memory block, and one or more main portions may have start addresses aligned with one or more memory boundaries.
In one embodiment, the secondary portion size may be a fraction of the memory block size. Accordingly, two or more secondary portions may be combined into one or more secondary portion groups and stored separately from their respective main portions. Further, in one embodiment, the first secondary portion in each secondary portion group may have a start address aligned with a memory boundary.
In another embodiment, at least one secondary portion is not located sequentially after its respective main portion in the frame buffer.
The main portions and secondary portions may be arranged in a frame buffer, such as frame buffer 131, according to different layouts. In one embodiment, the layout may include a repeating pattern of superblocks, where a superblock is the smallest repeating unit in the frame buffer. The main portions and secondary portions in the frame buffer can thus be arranged, for example, by placing superblocks sequentially next to each other. The superblock size may be set to a multiple of the memory block size.
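Because the superblock is the repeating unit, locating an access unit amounts to finding its superblock and then applying fixed offsets inside it. The following is a sketch under assumptions: `locate` and `frame_base` are hypothetical names, and the offset table uses the superblock 341A figures with secondary portions assumed to be in index order.

```python
# Offsets inside one superblock 341A (160-byte access units, 128-byte blocks).
SUPERBLOCK_341A = {
    "M0": 0, "M1": 128,
    "S0": 256, "S1": 288, "S2": 320, "S3": 352,   # order within group assumed
    "M2": 384, "M3": 512,
}
SUPERBLOCK_SIZE = 640
UNITS_PER_SUPERBLOCK = 4


def locate(au_index, frame_base=0):
    """Return (main address, secondary address) of access unit `au_index` in a
    frame buffer made of back-to-back copies of the superblock."""
    sb, k = divmod(au_index, UNITS_PER_SUPERBLOCK)
    base = frame_base + sb * SUPERBLOCK_SIZE
    return base + SUPERBLOCK_341A[f"M{k}"], base + SUPERBLOCK_341A[f"S{k}"]


first = locate(0)    # access unit 0, first superblock
sixth = locate(5)    # access unit 5, i.e. unit 1 of the second superblock
```

Since the superblock size is a multiple of the memory block size, alignment that holds inside one superblock also holds for every repetition, provided `frame_base` is itself aligned.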
In one embodiment, the start addresses of the main portions in a superblock are aligned with memory boundaries. The secondary portions in a superblock may be combined into one or more secondary portion groups whose size is a multiple of the memory block size. The first secondary portion in each secondary portion group may be aligned with a memory boundary. Some exemplary superblocks with these characteristics are shown in FIG. 3 and FIG. 4.
In another embodiment, a superblock may have one or more main portions whose size is smaller than the memory block size. Some exemplary superblocks are shown in FIG. 5. For example, as many secondary portions as possible immediately follow their respective main portions (e.g., in superblock 541A of FIG. 5, S0 follows M0 and S3 follows M3). As another example, each main portion lies entirely within a single memory block.
In one embodiment, the layout does not include a repeating pattern of superblocks. Instead, the main portions and secondary portions in the frame buffer can be arranged as a main portion group and a secondary portion group, as in the example shown in FIG. 6. For example, the main portion group includes main portions whose start addresses are aligned with consecutive memory boundaries. The secondary portion group includes secondary portions placed next to each other. The first secondary portion of the secondary portion group may be aligned with a memory boundary.
In step S740, the access unit may be compressed into a compressed access unit in order to reduce the bandwidth required for data transfer between the memory and another device that accesses the memory. For example, in memory system 100, memory 130 may be located on a different chip from memory allocation device 110, and memory allocation device 110 is configured to compress the access units to reduce the bandwidth required for data transfer between memory 130 and memory allocation device 110. Both lossless and lossy compression methods can be used to compress the access units. Lossless compression methods preserve the quality of the original data, while lossy compression methods can achieve more compression. The compression method may be a general-purpose compression method, an image compression method, a video compression method, or the like. For example, the compression method may include run-length encoding, dictionary-based algorithms, Huffman coding, deflate compression, chroma subsampling, discrete cosine transform, and the like.
In step S750, the size of the compressed access unit is compared with the size of the main portion. In one example, memory allocation device 110 is configured to compare the size of the compressed access unit with the size of the main portion. If the size of the compressed access unit is larger than the size of the main portion, the process proceeds to step S770. Otherwise, the process proceeds to step S760.
In step S760, the compressed access unit can be stored entirely in its respective main portion, because the size of the compressed access unit is smaller than or equal to the size of the main portion. In one example, memory controller 120 is configured to store the compressed access unit in its respective main portion in response to an instruction from memory allocation device 110.
When the size of the compressed access unit is larger than the size of the main portion, the process proceeds to step S770. In step S770, a first part of the compressed access unit may be stored in the respective main portion. The first part of the compressed access unit may have the same size as the main portion and fill it completely. In one example, memory controller 120 is configured to store the first part of the compressed access unit in its respective main portion in response to an instruction from memory allocation device 110.
In step S780, the second, or remaining, part of the compressed access unit may be stored in the respective secondary portion. Thus, when the size of the compressed access unit is larger than the size of the main portion, the compressed access unit is split between its respective main portion and secondary portion. In one example, memory controller 120 is configured to store the remaining part of the compressed access unit in the respective secondary portion in response to an instruction from memory allocation device 110.
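Steps S750 through S780 can be sketched as a single store routine. This is an illustration only: the function name and parameters are hypothetical, a `bytearray` stands in for the frame buffer, and rejecting data larger than the two portions combined is an assumption, since the text does not say how that case is handled.

```python
def store_compressed_unit(frame, data, main_off, main_size, sub_off, sub_size):
    """Write `data` into its main portion and, if it does not fit, write the
    remainder into its secondary portion. Returns (bytes in main, bytes in sub)."""
    if len(data) > main_size + sub_size:
        raise ValueError("compressed unit larger than main + secondary portions")
    head, tail = data[:main_size], data[main_size:]       # S750: size comparison
    frame[main_off:main_off + len(head)] = head           # S760 or S770
    if tail:                                              # S780: remainder only
        frame[sub_off:sub_off + len(tail)] = tail
    return len(head), len(tail)


frame = bytearray(640)                        # one superblock 341A
big = bytes(range(150))                       # 150 bytes: needs both portions
small = bytes([7]) * 100                      # 100 bytes: fits in main portion
split_big = store_compressed_unit(frame, big, 0, 128, 256, 32)      # M0 / S0
split_small = store_compressed_unit(frame, small, 128, 128, 288, 32)  # M1 / S1
```

The second call leaves the secondary portion untouched, matching the case where the secondary portion remains empty.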
Steps S740 through S780 may be repeated for all access units before the process proceeds to step S799, where it ends. In one example, memory allocation device 110 and memory controller 120 are configured to repeat steps S740 through S780 for all access units in the input image.
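The whole loop can be sketched end to end. Several choices here are assumptions and not taken from the text: `zlib` stands in for the compressor, the FIG. 6 style layout is used for addressing, a side list `meta` records each stored length, and a unit that does not shrink is stored uncompressed.

```python
import zlib


def store_image(units, main_size, sub_size):
    """Compress and store each access unit; return (frame buffer, metadata)."""
    n = len(units)
    sub_base = n * main_size                     # secondary group after mains
    frame = bytearray(n * (main_size + sub_size))
    meta = []                                    # (stored length, stored raw?)
    for i, unit in enumerate(units):
        assert len(unit) == main_size + sub_size
        payload = zlib.compress(unit)            # S740
        raw = len(payload) > main_size + sub_size
        if raw:
            payload = unit                       # assumed fallback: store as-is
        head, tail = payload[:main_size], payload[main_size:]   # S750
        frame[i * main_size:i * main_size + len(head)] = head   # S760 / S770
        if tail:                                                # S780
            sub_off = sub_base + i * sub_size
            frame[sub_off:sub_off + len(tail)] = tail
        meta.append((len(payload), raw))
    return frame, meta


def load_unit(frame, meta, i, main_size, sub_size):
    """Read access unit `i` back from its main and secondary portions."""
    length, raw = meta[i]
    head_len = min(length, main_size)
    main_off = i * main_size
    sub_off = len(meta) * main_size + i * sub_size
    data = bytes(frame[main_off:main_off + head_len])
    data += bytes(frame[sub_off:sub_off + length - head_len])
    return data if raw else zlib.decompress(data)


units = [bytes(160), bytes(range(160)), b"ab" * 80]
frame, meta = store_image(units, 128, 32)
restored = [load_unit(frame, meta, i, 128, 32) for i in range(len(units))]
```

A unit that compresses to at most `main_size` bytes is read back with a single access to its main portion, which is the bandwidth saving the layout is designed for.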
In various embodiments, the access unit size, the main portion size, and the secondary portion size may be selected and kept constant for an input image. On the other hand, multiple input images, such as a sequence of video frames, may be stored in the memory. The access unit size, the main portion size, and the secondary portion size may be selected for each individual input image, and may therefore vary dynamically from one input image to another.
In various examples, memory allocation device 110, or the functions of memory allocation device 110, may be implemented in hardware, software, or a combination thereof. In one example, memory allocation device 110 is implemented in hardware, such as processing circuitry, which may include one or more of discrete components, integrated circuits, application-specific integrated circuits (ASICs), and the like. In another example, the functions of the memory allocation device may be implemented in software or firmware comprising instructions stored in a non-transitory computer-readable storage medium which, when executed by processing circuitry, cause the processing circuitry to perform the respective functions.
While aspects of the present invention have been described in conjunction with specific embodiments presented as examples, alternatives, modifications, and variations of these examples may be made. Accordingly, the embodiments described herein are intended to be illustrative and not limiting. Changes may be made without departing from the scope of the claims.
Claims (20)
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762489588P | 2017-04-25 | 2017-04-25 | |
US62/489,588 | 2017-04-25 | ||
US15/786,908 | 2017-10-18 | ||
US15/786,908 US20180107616A1 (en) | 2016-10-18 | 2017-10-18 | Method and device for storing an image into a memory |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108804508A true CN108804508A (en) | 2018-11-13 |
CN108804508B CN108804508B (en) | 2022-06-07 |
Family
ID=64094364
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810344898.2A Active CN108804508B (en) | 2017-04-25 | 2018-04-17 | A method and system for storing input images |
CN201810373709.4A Expired - Fee Related CN108833922B (en) | 2017-04-25 | 2018-04-24 | Method for accessing frame buffer, method and apparatus for processing access unit |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810373709.4A Expired - Fee Related CN108833922B (en) | 2017-04-25 | 2018-04-24 | Method for accessing frame buffer, method and apparatus for processing access unit |
Country Status (2)
Country | Link |
---|---|
CN (2) | CN108804508B (en) |
TW (2) | TW201839714A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4772956A (en) * | 1987-06-02 | 1988-09-20 | Eastman Kodak Company | Dual block still video compander processor |
CN101499097A (en) * | 2009-03-16 | 2009-08-05 | 浙江工商大学 | Hash table based data stream frequent pattern internal memory compression and storage method |
GB2457262A (en) * | 2008-02-08 | 2009-08-12 | Linear Algebra Technologies | Compression / decompression of data blocks, applicable to video reference frames |
CN102740074A (en) * | 2012-06-05 | 2012-10-17 | 沙基昌 | Video data compressing/decompressing method and system |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7106909B2 (en) * | 2001-12-25 | 2006-09-12 | Canon Kabushiki Kaisha | Method and apparatus for encoding image data in accordance with a target data size |
CN101212680B (en) * | 2006-12-30 | 2011-03-23 | 扬智科技股份有限公司 | Memory access method and system for image data |
US20140219361A1 (en) * | 2013-02-01 | 2014-08-07 | Samplify Systems, Inc. | Image data encoding for access by raster and by macroblock |
CN105472442B (en) * | 2015-12-01 | 2018-10-23 | 上海交通大学 | Compressibility is cached outside a kind of piece for ultra high-definition frame rate up-conversion |
2018
- 2018-04-17 CN CN201810344898.2A patent/CN108804508B/en active Active
- 2018-04-18 TW TW107113191A patent/TW201839714A/en unknown
- 2018-04-24 CN CN201810373709.4A patent/CN108833922B/en not_active Expired - Fee Related
- 2018-04-24 TW TW107113842A patent/TW201840177A/en unknown
Also Published As
Publication number | Publication date |
---|---|
CN108833922B (en) | 2020-12-18 |
CN108833922A (en) | 2018-11-16 |
TW201840177A (en) | 2018-11-01 |
CN108804508B (en) | 2022-06-07 |
TW201839714A (en) | 2018-11-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10366467B1 (en) | Method and apparatus for accessing compressed data and/or uncompressed data of image frame in frame buffer | |
US11023152B2 (en) | Methods and apparatus for storing data in memory in data processing systems | |
CN105431831B (en) | Data access method and data access device using the same | |
CN102915280B (en) | For the configurable buffer allocation methods, devices and systems of multiple format video process | |
TWI744289B (en) | A central processing unit (cpu)-based system and method for providing memory bandwidth compression using multiple last-level cache (llc) lines | |
US8918589B2 (en) | Memory controller, memory system, semiconductor integrated circuit, and memory control method | |
EP2616945B1 (en) | Allocation of memory buffers in computing system with multiple memory channels | |
KR101773396B1 (en) | Graphic Processing Apparatus and Method for Decompressing to Data | |
CN102314400B (en) | Method and device for dispersing converged DMA (Direct Memory Access) | |
CN104952088B (en) | A kind of method for being compressed and decompressing to display data | |
US10216412B2 (en) | Data processing systems | |
US10249269B2 (en) | System on chip devices and operating methods thereof | |
CN101212680B (en) | Memory access method and system for image data | |
US20180107616A1 (en) | Method and device for storing an image into a memory | |
CN108804508B (en) | A method and system for storing input images | |
US20210200679A1 (en) | System and method for mixed tile-aware and tile-unaware traffic through a tile-based address aperture | |
CN107204199A (en) | Semiconductor memory device and address control method thereof | |
US12361508B2 (en) | Methods of and apparatus for storing data in memory in graphics processing systems | |
CN116523729B (en) | Graphics processing device, graphics rendering pipeline distribution method and related devices | |
US9864699B1 (en) | Method and apparatus for compressing LUT | |
US11907855B2 (en) | Data transfers in neural processing | |
CN113873251A (en) | Multi-channel panchromatic multispectral image compression scheduling method for address partition management | |
CN106919514A (en) | Semiconductor device, data processing system, and semiconductor device control method | |
CN114697675A (en) | Decoding display system and its memory access method | |
US11048413B2 (en) | Method for reducing read ports and accelerating decompression in memory systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||