
CN109089124B - A method and device for reusing inter-frame data for motion estimation - Google Patents

A method and device for reusing inter-frame data for motion estimation

Info

Publication number
CN109089124B
Authority
CN
China
Prior art keywords
frames
frame
data
current
motion estimation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811018540.7A
Other languages
Chinese (zh)
Other versions
CN109089124A (en)
Inventor
徐卫志
郭元元
于惠
陆佃杰
张宇昂
刘方爱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Normal University
Priority to CN201811018540.7A
Publication of CN109089124A
Application granted
Publication of CN109089124B
Legal status: Active
Anticipated expiration

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/43 Hardware specially adapted for motion estimation or compensation
    • H04N19/433 Hardware specially adapted for motion estimation or compensation characterised by techniques for memory access

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses an inter-frame data reuse method and device for motion estimation. The inter-frame data reuse method for motion estimation comprises the following steps: processing at least two current frames in turn within the same time period, using the same starting point and scanning order; when any two adjacent frames are processed, the data of the current frame is read into the on-chip cache in time so that it can be read directly when the adjacent frame is processed, realizing inter-frame reuse of the current frame's data. In this way, when m current frames are processed in the same time period, only 1/m of the frames are read twice and the remaining frames need to be read only once, where m ≥ 2 and m is a positive integer.

Description

Inter-frame data reuse method and device for motion estimation
Technical Field
The invention belongs to the field of data processing, and particularly relates to a motion estimation-oriented interframe data reuse method and device.
Background
Frame rate up-conversion is used to improve the display quality of moving images on liquid crystal televisions. By raising the refresh rate of the video, it alleviates the marked loss of motion resolution and the motion-blur trailing caused by the response delay of liquid crystal display panels. Frame rate up-conversion is achieved by inserting one or more frames between two consecutive frames, and frame interpolation based on motion estimation is one of the most effective methods.
The accuracy of motion estimation directly affects the quality of the interpolated frames. Motion estimation is usually the most computation- and memory-intensive part of frame rate up-conversion and accounts for most of its running time.
Block-matching motion estimation is the mainstream approach because it is simple and efficient. Block matching searches for the reference block that best matches the current macroblock; the sum of absolute differences (SAD) is a common criterion for deciding the best match. The displacement between the current macroblock and the best-matching reference macroblock is the motion vector (MV).
Full-search motion estimation (FSIME) uses a brute-force search over the whole search window and therefore finds the best-matching macroblock with the highest accuracy. Its regular structure makes FSIME well suited to hardware implementation, but it requires a large amount of computation and memory access.
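To make the block-matching procedure concrete, the following is a minimal NumPy sketch of full-search motion estimation for a single macroblock. The function names, the macroblock size N = 16 and the search range of ±8 pixels are illustrative assumptions, not values taken from the patent.

```python
import numpy as np

def sad(block_a: np.ndarray, block_b: np.ndarray) -> int:
    """Sum of absolute differences (the L1 norm of the difference) of two equal-sized blocks."""
    return int(np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum())

def full_search(cur_frame: np.ndarray, ref_frame: np.ndarray,
                top: int, left: int, n: int = 16, search: int = 8):
    """Brute-force search over all candidate positions within +/- `search` pixels
    of (top, left) in the reference frame; returns the motion vector (dy, dx)
    and the SAD of the best-matching candidate."""
    cur_block = cur_frame[top:top + n, left:left + n]
    h, w = ref_frame.shape
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > h or x + n > w:
                continue  # candidate block would fall outside the reference frame
            cost = sad(cur_block, ref_frame[y:y + n, x:x + n])
            if best_sad is None or cost < best_sad:
                best_sad, best_mv = cost, (dy, dx)
    return best_mv, best_sad
```

Every candidate position is evaluated, which is what makes the full search regular but expensive; the fast algorithms mentioned below evaluate only a subset of these positions.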
In addition, fast search algorithms such as the three-step search, new three-step search, diamond search and four-step search have been proposed to reduce the time overhead, but usually at some loss of accuracy: a fast search may miss the best-matching macroblock, and some fast search algorithms are not suitable for hardware implementation.
In recent years the gap between processor computing speed and memory access speed has kept widening, so reducing off-chip memory access has become an important means of improving the performance of real-time video applications, and on-chip data reuse is an effective way to reduce off-chip access. Several data reuse methods have been proposed for FSIME, but they focus on improving data reuse within a reference frame and ignore data reuse between frames. For frame rate up-conversion, the conventional memory access design reads each frame twice from off-chip memory, first as the current frame and later as the previous frame, which slows down video processing.
In summary, the prior art does not yet provide an effective solution to the problem that, in the conventional memory access design, each frame must be read twice from off-chip memory.
Disclosure of Invention
To solve the problem that each frame must be read twice from off-chip memory in the conventional memory access design, a first object of the invention is to provide an inter-frame data reuse method for motion estimation that reduces the number of memory accesses and increases video processing speed through data reuse between adjacent frames.
The inter-frame data reuse method for motion estimation of the invention comprises the following steps:
processing at least two current frames in turn within the same time period, using the same starting point and scanning order;
when any two adjacent frames are processed, the data of the current frame is read into the on-chip cache in time so that it can be read directly when the adjacent frame is processed, realizing inter-frame reuse of the current frame's data; in this way, when m current frames are processed in the same time period, only 1/m of the frames are read twice and the remaining frames need to be read only once, where m ≥ 2 and m is a positive integer.
Further, processing the current frame includes: calculating, according to the pre-divided macroblocks, the sum of absolute differences and the motion vector of the corresponding macroblock.
The sum of absolute differences is a criterion for determining the reference block that best matches the current macroblock and is a measure of the similarity between image blocks. The absolute value of the difference between each pixel of the original block and the corresponding pixel of the block being compared is taken, and these differences are summed to give a simple measure of block similarity: the L1 norm of the difference image, or the Manhattan distance between the two image blocks.
The motion vector is the displacement between the current macroblock and the reference macroblock.
Further, the inter-frame data reuse efficiency increases as the number of current frames processed within the same time period increases.
A second object of the invention is to provide an inter-frame data reuse device for motion estimation that reduces the number of memory accesses and increases video processing speed through data reuse between adjacent frames.
The invention relates to a motion estimation-oriented interframe data reuse device, which comprises:
an off-chip memory for storing frame-sequence image data;
an on-chip cache for storing reusable data;
a processor configured to perform the steps of:
processing at least two current frames in turn within the same time period, using the same starting point and scanning order;
when any two adjacent frames are processed, the data of the current frame is read into the on-chip cache in time so that it can be read directly when the adjacent frame is processed, realizing inter-frame reuse of the current frame's data; in this way, when m current frames are processed in the same time period, only 1/m of the frames are read twice and the remaining frames need to be read only once, where m ≥ 2 and m is a positive integer.
Further, the processor is further configured to:
calculating, according to the pre-divided macroblocks, the sum of absolute differences and the motion vector of the corresponding macroblock.
Furthermore, the inter-frame data reuse device for motion estimation is a unidirectional frame rate up-conversion motion estimation architecture; when any two adjacent frames are processed, one current macroblock buffer for the current frame and two search window buffers for storing the image data of the two adjacent frames are allocated in the on-chip cache.
Furthermore, the inter-frame data reuse device for motion estimation is a bidirectional frame rate up-conversion motion estimation architecture; when any two adjacent frames are processed, three search window buffers are allocated in the on-chip cache, storing the image data of the two adjacent frames and of the current frame respectively.
Further, the processor comprises a plurality of processing element arrays working in parallel.
Further, the processor is a GPU processor.
Further, the on-chip cache is a shared memory of the GPU processor.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention processes at least two current frames in turn within the same time period, using the same starting point and scanning order; when any two adjacent frames are processed, the data of the current frame is read into the on-chip cache in time so that it can be read directly when the adjacent frame is processed, realizing inter-frame reuse of the current frame's data. The number of memory accesses is thereby reduced, the memory access time and bandwidth requirements are lowered, and the running speed of motion estimation and related video applications is increased.
(2) The invention does not need to store the whole frame on-chip; only several search window buffers need to be kept on-chip, which reduces the on-chip memory required by the data reuse technique and improves the chip layout of the hardware.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application.
FIG. 1 is a diagram of the Inter-C architecture on FRUC-UME in accordance with the present invention;
FIG. 2 is a sequence diagram of the PEA processing the current CBs in the Inter-C architecture on FRUC-UME in accordance with the present invention;
FIG. 3 is a diagram of the Inter-C architecture on FRUC-BME in accordance with the present invention;
FIG. 4 is a graph of the number of reads of each frame in a frame sequence under the Inter-C architecture in accordance with the present invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
Interpretation of terms:
FRUC-ME: motion estimation in Frame Rate Up Conversion, a Motion estimation algorithm for Frame Rate Up.
FRUC-UME: unidirectional FRUC-ME, one-way frame rate up motion estimation.
FRUC-BME: bidirectional FRUC-ME, Bidirectional frame rate boost motion estimation.
PEA: processing Element Array, Processing Element Array.
Inter-C: and C-level inter-frame data reuse.
SR: search Range, Search window.
CB: current Block, Current macroblock.
The principle of the inter-frame data reuse method for motion estimation disclosed by the invention is as follows:
processing at least two current frames in turn within the same time period, using the same starting point and scanning order;
when any two adjacent frames are processed, the data of the current frame is read into the on-chip cache in time so that it can be read directly when the adjacent frame is processed, realizing inter-frame reuse of the current frame's data; in this way, when m current frames are processed in the same time period, only 1/m of the frames are read twice and the remaining frames need to be read only once, where m ≥ 2 and m is a positive integer.
The specific implementation is exemplified by two typical FRUC-MEs:
for the proposed method Inter-C, it is not necessary to store the whole frame of image on-chip, only multiple SRs need to be stored on-chip at the same time. One SR cache is used to store one SR.
Reference is made to FIG. 1 for the Inter-C architecture on FRUC-UME in accordance with the present invention.
As shown in FIG. 1, in the Inter-C architecture on FRUC-UME there are two SR buffers and one CB buffer on chip. The two SR buffers hold data from the two frames Frame i and Frame i-1, i.e. each SR buffer stores one SR of a reference frame.
The CB buffer holds a macroblock of Frame i+1. Both the search window width (SRH) and the search window height (SRV) are equal to twice the macroblock size (2N).
The PEA calculates the SAD values and the motion vector for the current block. In this architecture the PEA processes the CBs of two current frames (Frame i and Frame i+1) in turn. Frame i is the current frame with respect to Frame i-1 and is also the reference frame of Frame i+1.
The object of this embodiment is to reduce the number of times Frame i is read from two to one. The CB of Frame i is contained in the SR buffer of Frame i; therefore the CBs of Frame i and Frame i+1 are processed in turn by the PEA, using the same starting point and scanning order.
Here 0 ≤ i ≤ m-1, m ≥ 2 and m is a positive integer.
Referring to FIG. 2, the order in which the PEA processes the current CBs in the Inter-C architecture on FRUC-UME in accordance with the present invention is shown.
As shown in FIG. 2, after processing CB0 of Frame i+1 in Step 0, the PEA continues with CB0 of Frame i in Step 1. CB0 of Frame i is already in the SR buffer at this point (it was read on-chip in Step 0), so it can be reused in Step 1.
The PEA then processes CB1 of Frame i+1 in Step 2.
Processing of CB1 of Frame i follows in Step 3. CB1 of Frame i is likewise already in the SR buffer (read on-chip in Step 2), so it can be reused again in Step 3.
In this way, the CB of Frame i is always in the on-chip buffer in time and never needs to be read from off-chip again, so inter-frame data reuse of Frame i is realized in the SR buffer.
However, Frame i-1 and Frame i+1 still need to be read on-chip twice.
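The Step 0/Step 1 alternation of FIG. 2 can be modelled with a few lines of plain Python. This is only a sketch of the idea: the helper names load_from_offchip and process, and the dictionary standing in for the on-chip buffers, are illustrative assumptions rather than parts of the patented architecture.

```python
def load_from_offchip(what: str) -> str:
    """Stand-in for one off-chip DRAM read (hypothetical helper)."""
    return what

def process(cb, sr) -> None:
    """Stand-in for the PEA computing SAD values and a motion vector for one CB."""
    pass

def interleaved_fruc_ume(num_cbs: int) -> int:
    """Model the interleaved processing of Frame i+1 and Frame i and count off-chip reads.

    The CB of Frame i is never fetched on its own: it is already contained in the
    SR of Frame i that was loaded one step earlier (the inter-frame reuse), while
    Frame i-1 and Frame i+1 data still come from off-chip in every step.
    """
    onchip = {}      # models the on-chip SR/CB buffers
    reads = 0
    for k in range(num_cbs):
        # Step 2k: process CB_k of Frame i+1 against the SR of Frame i.
        onchip["cb_next"] = load_from_offchip(f"CB{k} of Frame i+1")
        reads += 1
        onchip["sr_i"] = load_from_offchip(f"SR{k} of Frame i")
        reads += 1
        process(onchip["cb_next"], onchip["sr_i"])

        # Step 2k+1: process CB_k of Frame i against the SR of Frame i-1.
        # CB_k of Frame i is taken from onchip["sr_i"], so only the SR of
        # Frame i-1 has to be loaded from off-chip.
        onchip["sr_prev"] = load_from_offchip(f"SR{k} of Frame i-1")
        reads += 1
        process(onchip["sr_i"], onchip["sr_prev"])
    return reads
```

In this model each pair of steps issues three off-chip reads instead of four, which corresponds to reading Frame i only once while Frame i-1 and Frame i+1 are still read twice.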
By analogy, one frame in every m frames needs to be read twice from off-chip and the other frames need to be read only once, so the inter-frame data reuse efficiency Ra of FRUC-UME Inter-C can be calculated as follows:
[The expression for Ra is given as an equation image (BDA0001786754590000051) in the original document and is not reproduced here.]
where SRV denotes the height of the search window, W and H denote the width and height of a frame respectively, N denotes the number of macroblocks, m denotes the number of current frames processed simultaneously, m ≥ 2, and m is a positive integer.
Reference is made to FIG. 3 for the Inter-C architecture on FRUC-BME in accordance with the present invention.
As shown in FIG. 3, the difference from FRUC-UME is that FRUC-BME replaces the CB buffer of Frame i+1 with an SR buffer, because each interpolated macroblock requires two adjacent search windows. The order of CB processing in FRUC-BME is the same as in FRUC-UME.
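For a rough feel of the on-chip storage the two architectures require, the following back-of-the-envelope sketch uses the SRH = SRV = 2N relation stated above and assumes 8-bit pixels; the exact buffer organisation in a real implementation may differ.

```python
def onchip_buffer_bytes(n: int = 16, bytes_per_pixel: int = 1) -> dict:
    """On-chip buffer footprint of the two Inter-C architectures.

    FRUC-UME: two SR buffers of (2N x 2N) pixels plus one CB buffer of (N x N) pixels.
    FRUC-BME: three SR buffers of (2N x 2N) pixels (the CB buffer is replaced by an SR buffer).
    """
    sr = (2 * n) * (2 * n) * bytes_per_pixel
    cb = n * n * bytes_per_pixel
    return {"FRUC-UME": 2 * sr + cb, "FRUC-BME": 3 * sr}

print(onchip_buffer_bytes())  # {'FRUC-UME': 2304, 'FRUC-BME': 3072} for N = 16 and 8-bit pixels
```

Either way the on-chip requirement stays on the order of a few search windows rather than a whole frame, which is the point made above.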
The inter-frame data reuse efficiency Ra of FRUC-BME Inter-C is calculated in a similar way to that of FRUC-UME Inter-C:
[The expression for Ra is given as an equation image (BDA0001786754590000061) in the original document and is not reproduced here.]
where SRV denotes the vertical search range, W and H denote the width and height of a frame respectively, N denotes the number of macroblocks, m denotes the number of current frames processed simultaneously, m ≥ 2, and m is a positive integer.
Referring to FIG. 4, the number of reads of each frame in a frame sequence under the Inter-C architecture in accordance with the present invention is shown.
As shown in FIG. 4, the number of current frames processed in the same time period can be increased by increasing the number of SR buffers, which in turn increases the data reuse efficiency.
If only one current frame is processed in a given time period, only intra-frame data reuse can be realized, and every frame needs to be read on-chip twice.
If two current frames are processed simultaneously in the same time period, inter-frame data reuse can be realized, and half of the frames need to be read only once.
If three frames are processed simultaneously in the same time period, only 1/3 of the frames need to be read twice.
The data reuse efficiency therefore increases with the number of current frames processed in the same time period: when m current frames are processed simultaneously, only 1/m of the frames are read twice, and the remaining frames need to be read only once.
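The effect of m on off-chip traffic follows directly from the statement above and can be checked with a few lines of arithmetic (a simple illustrative sketch):

```python
def average_reads_per_frame(m: int) -> float:
    """With m current frames processed per time period, 1/m of the frames are read
    twice and the remaining (m-1)/m are read once, i.e. 1 + 1/m reads per frame on
    average, compared with 2 reads per frame when no inter-frame reuse is used."""
    assert isinstance(m, int) and m >= 2
    return (1 / m) * 2 + ((m - 1) / m) * 1

for m in (2, 3, 4, 8):
    print(m, average_reads_per_frame(m))  # 1.5, 1.33..., 1.25, 1.125
```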
The invention also provides an inter-frame data reuse device for motion estimation.
The invention relates to a motion estimation-oriented interframe data reuse device, which comprises:
an off-chip memory for storing frame-sequence image data;
an on-chip cache for storing reusable data;
a processor configured to perform the steps of:
processing at least two current frames in turn within the same time period, using the same starting point and scanning order;
when any two adjacent frames are processed, the data of the current frame is read into the on-chip cache in time so that it can be read directly when the adjacent frame is processed, realizing inter-frame reuse of the current frame's data; in this way, when m current frames are processed in the same time period, only 1/m of the frames are read twice and the remaining frames need to be read only once, where m ≥ 2 and m is a positive integer.
In a specific implementation, the processor is further configured to:
calculating, according to the pre-divided macroblocks, the sum of absolute differences and the motion vector of the corresponding macroblock.
In an embodiment, the inter-frame data reuse device for motion estimation is a unidirectional frame rate up-conversion motion estimation architecture; when any two adjacent frames are processed, one current macroblock buffer for the current frame and two search window buffers for storing the image data of the two adjacent frames are allocated in the on-chip cache.
In another embodiment, the inter-frame data reuse device for motion estimation is a bidirectional frame rate up-conversion motion estimation architecture; when any two adjacent frames are processed, three search window buffers are allocated in the on-chip cache, storing the image data of the two adjacent frames and of the current frame respectively.
In a specific implementation, the processor comprises a plurality of processing element arrays (PEAs) working in parallel. Parallelism can be increased by increasing the number of PEAs.
In a specific implementation, besides using a PEA to realize data reuse, the shared memory of a GPU (Graphics Processing Unit) may be used to realize inter-frame data reuse.
In that case the processor is a GPU, and the on-chip cache is the shared memory of the GPU.
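As one possible illustration of using GPU shared memory as the on-chip cache, the following Numba CUDA sketch keeps the current macroblock in shared memory while every thread of the block evaluates one candidate position of the search window. It is a minimal sketch under stated assumptions (macroblock size N = 16, an SRH = SRV = 2N search window, pixel values stored as int32) and is not the patented architecture itself; the kernel and variable names are invented for this example.

```python
import numpy as np
from numba import cuda, int32

N = 16      # macroblock size (illustrative)
SRH = 32    # search window width  = 2N, as described above
SRV = 32    # search window height = 2N

@cuda.jit
def sad_per_candidate(cur_block, search_win, sads):
    """One thread per candidate position: stage the current block in shared memory
    (the GPU's on-chip storage) so all threads reuse it instead of re-reading it
    from global (off-chip) memory."""
    cb = cuda.shared.array(shape=(16, 16), dtype=int32)  # shape must be a compile-time constant
    tx = cuda.threadIdx.x
    ty = cuda.threadIdx.y
    if tx < N and ty < N:
        cb[ty, tx] = cur_block[ty, tx]
    cuda.syncthreads()

    if ty <= SRV - N and tx <= SRH - N:
        s = 0
        for i in range(N):
            for j in range(N):
                s += abs(search_win[ty + i, tx + j] - cb[i, j])
        sads[ty, tx] = s

# Host side: one thread block of (SRH-N+1) x (SRV-N+1) threads covers all candidates.
cur = np.random.randint(0, 256, (N, N)).astype(np.int32)
win = np.random.randint(0, 256, (SRV, SRH)).astype(np.int32)
d_out = cuda.device_array((SRV - N + 1, SRH - N + 1), dtype=np.int32)
sad_per_candidate[1, (SRH - N + 1, SRV - N + 1)](cuda.to_device(cur), cuda.to_device(win), d_out)
sads = d_out.copy_to_host()
best_dy, best_dx = np.unravel_index(sads.argmin(), sads.shape)  # position of the best match
```

In this analogy the shared memory plays the role that the SR buffers play in the PEA-based architecture: data that has been brought on-chip once is read from there by later steps instead of from global memory.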
In addition, Inter-C can also exploit data reuse between adjacent SRs within the previous frame at the same time, i.e. Inter-C is compatible with Intra-C (intra-frame level-C data reuse).
The invention processes at least two current frames in turn within the same time period, using the same starting point and scanning order; when any two adjacent frames are processed, the data of the current frame is read into the on-chip cache in time so that it can be read directly when the adjacent frame is processed, realizing inter-frame reuse of the current frame's data. The number of memory accesses and the memory access time are thereby reduced, the memory bandwidth requirement is lowered, and the running speed of motion estimation and related video applications is increased.
The invention does not need to store the whole frame on-chip; only several search window buffers need to be kept on-chip, which reduces the on-chip memory required by the data reuse technique and improves the chip layout of the hardware.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, it is not intended to limit the scope of the present invention, and it should be understood by those skilled in the art that various modifications and variations can be made without inventive efforts by those skilled in the art based on the technical solution of the present invention.

Claims (7)

1. An inter-frame data reuse method for motion estimation, characterized by comprising:
processing at least two current frames in turn within the same time period, using the same starting point and scanning order;
when any two adjacent frames are processed, the data of the current frame is read into the on-chip cache in time so that it can be read directly when the adjacent frame is processed, realizing inter-frame reuse of the current frame's data; in this way, when m current frames are processed in the same time period, only 1/m of the frames are read twice and the remaining frames need to be read only once, where m ≥ 2 and m is a positive integer;
the inter-frame data reuse method for motion estimation is executed on the Inter-C architecture on FRUC-UME;
when any two adjacent frames are processed, one current macroblock buffer for the current frame and two search window buffers for storing the image data of the two adjacent frames are allocated in the on-chip cache;
the whole frame does not need to be stored on-chip, only several search window buffers need to be stored on-chip, which reduces the on-chip memory required by the data reuse technique;
processing the current frame comprises: calculating, according to the pre-divided macroblocks, the sum of absolute differences and the motion vector of the corresponding macroblock.
2. The inter-frame data reuse method for motion estimation according to claim 1, characterized in that the inter-frame data reuse efficiency increases as the number of current frames processed within the same time period increases.
3. An inter-frame data reuse device for motion estimation, characterized by comprising:
an off-chip memory for storing frame-sequence image data;
an on-chip cache for storing reusable data;
a processor configured to perform the following steps:
processing at least two current frames in turn within the same time period, using the same starting point and scanning order;
when any two adjacent frames are processed, the data of the current frame is read into the on-chip cache in time so that it can be read directly when the adjacent frame is processed, realizing inter-frame reuse of the current frame's data; in this way, when m current frames are processed in the same time period, only 1/m of the frames are read twice and the remaining frames need to be read only once, where m ≥ 2 and m is a positive integer;
the processor is further configured to: calculate, according to the pre-divided macroblocks, the sum of absolute differences and the motion vector of the corresponding macroblock;
the inter-frame data reuse device for motion estimation is a unidirectional frame rate up-conversion motion estimation architecture; when any two adjacent frames are processed, one current macroblock buffer for the current frame and two search window buffers for storing the image data of the two adjacent frames are allocated in the on-chip cache;
the whole frame does not need to be stored on-chip, only several search window buffers need to be stored on-chip, which reduces the on-chip memory required by the data reuse technique;
processing the current frame comprises: calculating, according to the pre-divided macroblocks, the sum of absolute differences and the motion vector of the corresponding macroblock.
4. The inter-frame data reuse device for motion estimation according to claim 3, characterized in that the device is a bidirectional frame rate up-conversion motion estimation architecture; when any two adjacent frames are processed, three search window buffers are allocated in the on-chip cache, storing the image data of the two adjacent frames and of the current frame respectively.
5. The inter-frame data reuse device for motion estimation according to claim 3, characterized in that the processor comprises a plurality of processing element arrays working in parallel.
6. The inter-frame data reuse device for motion estimation according to claim 3, characterized in that the processor is a GPU.
7. The inter-frame data reuse device for motion estimation according to claim 6, characterized in that the on-chip cache is a shared memory of the GPU.
CN201811018540.7A 2018-09-03 2018-09-03 A method and device for reusing inter-frame data for motion estimation Active CN109089124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811018540.7A CN109089124B (en) 2018-09-03 2018-09-03 A method and device for reusing inter-frame data for motion estimation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811018540.7A CN109089124B (en) 2018-09-03 2018-09-03 A method and device for reusing inter-frame data for motion estimation

Publications (2)

Publication Number Publication Date
CN109089124A CN109089124A (en) 2018-12-25
CN109089124B true CN109089124B (en) 2021-10-19

Family

ID=64840555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811018540.7A Active CN109089124B (en) 2018-09-03 2018-09-03 A method and device for reusing inter-frame data for motion estimation

Country Status (1)

Country Link
CN (1) CN109089124B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107615763A (en) * 2015-05-28 2018-01-19 寰发股份有限公司 Method and apparatus for using current image as reference image
CN107925769A (en) * 2015-09-08 2018-04-17 联发科技股份有限公司 Method and system for a decoded image buffer for intra block copy mode

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2018148486A (en) * 2017-03-08 2018-09-20 キヤノン株式会社 Image encoding apparatus, image encoding method and program

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107615763A (en) * 2015-05-28 2018-01-19 寰发股份有限公司 Method and apparatus for using current image as reference image
CN107925769A (en) * 2015-09-08 2018-04-17 联发科技股份有限公司 Method and system for a decoded image buffer for intra block copy mode

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"A Novel Data Reuse Method to Reduce Demand on Memory Bandwidth and Power Consumption For True Motion Estimation";Weizhi Xu;《IEEE Access volume 6》;20180219;pages10151-10159 *

Also Published As

Publication number Publication date
CN109089124A (en) 2018-12-25

Similar Documents

Publication Publication Date Title
KR100952861B1 (en) Digital video data processing
US11223838B2 (en) AI-assisted programmable hardware video codec
US8265160B2 (en) Parallel three-dimensional recursive search (3DRS) meandering algorithm
US20070071101A1 (en) Systolic-array based systems and methods for performing block matching in motion compensation
US9262839B2 (en) Image processing device and image processing method
US20160080768A1 (en) Encoding system using motion estimation and encoding method using motion estimation
US8345764B2 (en) Motion estimation device having motion estimation processing elements with adder tree arrays
EP3823282A1 (en) Video encoding method and device, and computer readable storage medium
US9460489B2 (en) Image processing apparatus and image processing method for performing pixel alignment
US20150181209A1 (en) Modular motion estimation and mode decision engine
CN103634604B (en) Multi-core DSP (digital signal processor) motion estimation-oriented data prefetching method
US6501799B1 (en) Dual-prime motion estimation engine
CN109089124B (en) A method and device for reusing inter-frame data for motion estimation
CN100496126C (en) Image coding device and method thereof
Moshnyaga A new computationally adaptive formulation of block-matching motion estimation
JP2009015637A (en) Computational unit and image filtering apparatus
WO2022110131A1 (en) Inter-frame prediction method and apparatus, and encoder, decoder and storage medium
JP2014078891A (en) Image processing apparatus and image processing method
WO2011001364A1 (en) Parallel three-dimensional recursive search (3drs) meandering algorithm
US20070153909A1 (en) Apparatus for image encoding and method thereof
CN114449294A (en) Motion estimation method, apparatus, device, storage medium and computer program product
CN109427071B (en) Full search block matching method and device
KR100571907B1 (en) Determination of the Number of Processing Elements in a Video Estimation Algorithm
CN101459761A (en) Image processing method and related device thereof
JP4519009B2 (en) SAD operation device and macroblock search device using the same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant