
WO2015180054A1 - Video encoding and decoding method and apparatus based on image super-resolution - Google Patents

Video encoding and decoding method and apparatus based on image super-resolution

Info

Publication number
WO2015180054A1
WO2015180054A1 · PCT/CN2014/078613 (CN2014078613W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
resolution
block
super
dictionary
Prior art date
Application number
PCT/CN2014/078613
Other languages
English (en)
French (fr)
Inventor
王荣刚
赵洋
王振宇
高文
王文敏
董胜富
黄铁军
马思伟
Original Assignee
北京大学深圳研究生院
Priority date
Filing date
Publication date
Application filed by 北京大学深圳研究生院
Priority to PCT/CN2014/078613 priority Critical patent/WO2015180054A1/zh
Publication of WO2015180054A1 publication Critical patent/WO2015180054A1/zh
Priority to US15/060,627 priority patent/US9986255B2/en


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N 19/51 Motion estimation or motion compensation
    • H04N 19/513 Processing of motion vectors
    • H04N 19/517 Processing of motion vectors by encoding
    • H04N 19/52 Processing of motion vectors by encoding by predictive encoding
    • H04N 19/53 Multi-resolution motion estimation; Hierarchical motion estimation
    • H04N 19/537 Motion estimation other than block-based
    • H04N 19/543 Motion estimation other than block-based using regions
    • H04N 19/59 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H04N 19/80 Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N 19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression

Definitions

  • the present invention relates to the field of image super-resolution technology, and in particular to a video encoding and decoding method and apparatus based on image super-resolution.
  • Traditional coding methods compress video images by exploiting the information redundancy between the image to be encoded and the video itself.
  • As coding technology advances, the redundancy of video coding is continuously reduced, and the temporal and spatial correlation between the image to be encoded and the video itself is fully exploited.
  • Predicting the image and video information with the help of information beyond the image and video to be encoded, and thereby reducing the amount of information in the image to be encoded and the video itself, is a new direction for greatly improving image and video compression efficiency.
  • the image super-resolution-based video encoding method includes: performing super-resolution interpolation processing on the video image to be encoded by using a pre-trained texture dictionary library to obtain a reference image, where the texture dictionary library includes one or more sets of dictionary bases, each dictionary base being a mapping group that combines a high-resolution image block of a training image with the low-resolution image block corresponding to that high-resolution image block, and the super-resolution interpolation processing includes image enlargement and recovery of image detail information; performing motion estimation and motion compensation on the reference image for each image block of the image to be encoded, to obtain a prediction block corresponding to each image block of the video image to be encoded; subtracting the corresponding prediction block from each image block of the video image to be encoded to obtain a prediction residual block; and encoding the prediction residual block.
  • the image super-resolution-based video decoding method provided by the embodiment of the present invention includes: decoding the acquired encoded image stream signal to obtain a prediction residual block; and performing super-resolution interpolation processing on the video image to be decoded by using a pre-trained texture dictionary library to obtain a reference image,
  • the texture dictionary library comprising one or more sets of dictionary bases,
  • each dictionary base being a mapping group that combines a high-resolution image block of a training image with the low-resolution image block corresponding to that high-resolution image block,
  • and the super-resolution interpolation processing including image enlargement and recovery of image detail information; performing motion compensation on the reference image for each image block of the video image to be decoded, to obtain a prediction block corresponding to each image block; and adding the prediction block to the prediction residual block to obtain the decoded video image.
  • the image super-resolution-based video encoding apparatus includes: a super-resolution interpolation processing unit, configured to perform super-resolution interpolation processing on the video image to be encoded by using a pre-trained texture dictionary library to obtain a reference image, where the texture dictionary library includes one or more sets of dictionary bases and each dictionary base includes a mapping group that combines a high-resolution image block of a training image with the low-resolution image block corresponding to that high-resolution image block, the super-resolution interpolation processing including image enlargement and recovery of image detail information; a prediction unit, configured to perform motion estimation and motion compensation on the reference image for each image block of the image to be encoded, to obtain a prediction block corresponding to each image block of the video image to be encoded; a subtraction calculation unit, configured to subtract the corresponding prediction block estimated by the prediction unit from each image block of the video image to be encoded, to obtain a prediction residual block; and a coding unit, configured to encode the prediction residual block calculated by the subtraction calculation unit.
  • the image super-resolution-based video decoding apparatus includes: a decoding unit, configured to decode the acquired encoded image stream signal to obtain a prediction residual block; a super-resolution interpolation processing unit, configured to use a pre-trained
  • texture dictionary library to perform super-resolution interpolation processing on the video image to be decoded to obtain a reference image,
  • where the texture dictionary library includes one or more sets of dictionary bases, each dictionary base including a high-resolution image block of a training image and the low-resolution image block corresponding to that high-resolution image block,
  • and the super-resolution interpolation processing includes image enlargement and recovery of image detail information; a prediction unit, configured to perform motion compensation on the reference image for each image block of the video image to be decoded, to obtain a prediction block corresponding to each image block; and an addition calculation unit, configured to add the prediction block obtained by the prediction unit and the
  • prediction residual block obtained by the decoding unit, to obtain the decoded video image.
  • the image super-resolution-based video encoding apparatus includes: a data input unit for inputting data; a data output unit for outputting data; a storage unit for storing data, including an executable program; and a processor, connected to the data input unit, the data output unit and the storage unit, for executing the executable program, execution of the program including completing the foregoing method.
  • the image super-resolution-based video decoding apparatus includes: a data input unit for inputting data; a data output unit for outputting data; a storage unit for storing data, including an executable program; and a processor, connected to the data input unit, the data output unit and the storage unit, for executing the executable program, execution of the program including completing the foregoing method.
  • the embodiments of the present invention have the following advantages:
  • with the image super-resolution-based video encoding and decoding method and apparatus provided by the present application, super-resolution interpolation processing is performed on the video image to be encoded/decoded before prediction, so that the image to be encoded/decoded can be enlarged and its detail information recovered; thus, when the image to be encoded/decoded is predicted to obtain a prediction block, the method restores
  • the original image more effectively than the prior-art method of predicting the video image by linear interpolation, avoiding the blurred prediction-block edges of the prior art, thereby improving the accuracy of video image prediction and in turn the coding efficiency of the video image.
  • FIG. 1 is a flowchart of an image super-resolution-based video encoding method according to Embodiment 1;
  • FIG. 2a-2c are schematic diagrams showing feature extraction of an image block partial texture structure according to an embodiment of the present application;
  • FIG. 3 is a flowchart of an implementation of step 101 of Embodiment 2;
  • FIG. 4 is a flowchart of an image super-resolution-based video decoding method according to Embodiment 3;
  • FIG. 5 is a flowchart of an embodiment of step 202 of Embodiment 3;
  • FIG. 6 is a schematic structural diagram of a device according to Embodiment 5 of the present application.
  • FIG. 7 is a schematic structural diagram of a super-resolution interpolation processing unit according to Embodiment 5 of the present application
  • FIG. 8 is a schematic structural diagram of a device according to Embodiment 6 of the present application
  • FIG. 9 is a schematic structural diagram of a super-resolution interpolation processing unit according to Embodiment 6 of the present application.
  • FIG. 10 is a schematic structural diagram of a device according to Embodiment 7 of the present application;
  • FIG. 11 is a schematic structural diagram of an apparatus according to Embodiment 8 of the present application.
  • a video encoding and decoding method and apparatus based on image super-resolution are provided, which can recover the high-frequency information of an image and improve image quality, and can thus be applied to temporal prediction of video images, improving prediction accuracy and in turn coding and decoding efficiency.
  • Embodiment 1:
  • FIG. 1 is a flow chart of a video encoding method based on image super resolution in an embodiment. As shown in FIG. 1 , this embodiment provides a video encoding method based on image super resolution, which may include the following steps:
  • the texture dictionary library includes: one or more sets of dictionary bases, the dictionary base is: a high-resolution image block of a training image and a mapping group of low-resolution image block combinations corresponding to the high-resolution image block,
  • the super-resolution interpolation process includes: performing image enlargement and recovery of image detail information.
  • Motion estimation and motion compensation are performed on the reference image for each image block of the video image to be encoded, to obtain a prediction block corresponding to each image block.
  • the image blocks may be partitioned on the video image according to a preset partition rule, for example dividing every 2x2 pixels into one image block; this partition rule is given only as an example and is not specifically limited by the present application.
  • motion estimation and motion compensation may be performed in the reference image for each partitioned image block of the video image to be encoded, and the position offset and corresponding pixel values of each image block in the reference frame are calculated, thereby obtaining the motion-estimated prediction block corresponding to each image block of the video image to be encoded.
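The motion-estimation step described above can be sketched as a simple exhaustive block-matching search on the super-resolved reference image. This is a hedged illustration only: the patent does not fix a search strategy, block size, or cost metric, so the full search within a ±4-pixel window and the SAD cost used below are assumptions.

```python
import numpy as np

def motion_estimate(block, ref, top, left, radius=4):
    """Exhaustive block matching: find the offset (dy, dx) within
    +/-radius whose reference patch best matches `block` under a
    sum-of-absolute-differences (SAD) cost.  Returns the best offset
    and the matched reference patch (the prediction block)."""
    h, w = block.shape
    best = (0, 0)
    best_cost = np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                continue  # candidate patch falls outside the reference image
            cand = ref[y:y + h, x:x + w]
            cost = np.abs(cand.astype(np.int64) - block.astype(np.int64)).sum()
            if cost < best_cost:
                best_cost, best = cost, (dy, dx)
    dy, dx = best
    pred = ref[top + dy:top + dy + h, left + dx:left + dx + w]
    return best, pred
```

The encoder would then form the prediction residual block as the difference between the original image block and `pred`, and encode that residual.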
  • the image super-resolution-based video encoding method provided in Embodiment 1 of the present application uses a pre-trained texture dictionary library to perform super-resolution interpolation processing on the video image to be encoded, enlarging the image and recovering its detail information; motion estimation is then performed on
  • the reference image obtained from the super-resolution interpolation processing to obtain the corresponding prediction blocks, the prediction blocks are subtracted from the video image to be encoded to obtain residual blocks, and the residual blocks are encoded.
  • Compared with the prior-art method of predicting the video image by linear interpolation, the method of the present application performs super-resolution interpolation processing on the video image to be encoded before prediction, enlarging the image and recovering its detail information. In this way, when motion estimation is performed on the image to be encoded to obtain prediction blocks, the blurred prediction-block edges of the prior art are avoided, improving prediction accuracy and in turn coding efficiency.
  • each dictionary base in the texture dictionary library is classified according to local features of high resolution image blocks of each training image and local features of low resolution image blocks corresponding to the high resolution image block.
  • the local features include a local binary structure (LBS, Local Binary Structure) and a sharp edge structure (SES, Sharp Edge Structure).
  • the texture dictionary is pre-trained; the pre-training of the texture dictionary may be implemented as follows:
  • A, B, C, and D are four adjacent pixel points.
  • the height of the pixel reflects the gray value of the pixel.
  • pixels A, B, C and D form a flat local region, so their gray values are equal.
  • the gray values of the pixels A and B are higher than the gray values of the pixels C and D.
  • This embodiment defines LBS-Geometry (LBS_G) to distinguish this geometric difference.
  • LBS-Geometry (LBS_G) is calculated as equation (1):

    LBS_G = Σ_{p=1..N} s(g_p − g_mean) · 2^(p−1),  where s(x) = 1 if x ≥ 0 and s(x) = 0 otherwise   (1)

    wherein g_p represents the gray value of the local p-th pixel and g_mean is the local pixel mean of the four pixels A, B, C and D. Four pixels are used in this embodiment as an example; in other embodiments the number of pixels may be another value N, N being a positive integer.
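A minimal sketch of the LBS_G computation as reconstructed above. The bit ordering of the N local pixels is an assumption; the property that matters is that distinct local geometric layouts map to distinct binary codes.

```python
def lbs_g(pixels):
    """Local Binary Structure (geometry): threshold each of the N local
    pixels against the local mean and pack the results into an N-bit code.
    pixels: sequence of N gray values (e.g. the 2x2 neighborhood A,B,C,D)."""
    mean = sum(pixels) / len(pixels)
    code = 0
    for p, g in enumerate(pixels):
        if g >= mean:          # s(g_p - g_mean) = 1 when g_p >= mean
            code |= 1 << p     # weight 2^(p-1) for the p-th pixel
    return code
```

A flat region (all four pixels equal) maps to a single fixed code, while the two layouts described in the text (top row brighter vs. left column brighter) produce different codes.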
  • This embodiment further defines LBS-Difference (LBS_D).
  • t is a preset grayscale threshold; in a specific embodiment, t is set to a relatively large value in order to distinguish sharp edges.
  • the training of the texture dictionary may use the K-means clustering method to obtain an under-complete dictionary, or may use a sparse coding method.
  • a certain number of samples (for example, 100,000) are selected from the feature samples, a number of category centers are obtained with the K-means clustering algorithm, and the set of these category centers is used as the texture dictionary library. Training the dictionary with K-means clustering yields an under-complete dictionary library of low dimensionality.
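The K-means dictionary training described above might be sketched as follows. Feature extraction and patch sampling are omitted; a plain Lloyd iteration over pre-computed feature vectors is used for illustration, and the iteration count and seeding strategy are assumptions.

```python
import numpy as np

def train_dictionary(samples, k, iters=20, seed=0):
    """Cluster feature samples (shape (n, d)) with Lloyd's K-means;
    the k cluster centers form an under-complete dictionary (k << n)."""
    rng = np.random.default_rng(seed)
    centers = samples[rng.choice(len(samples), size=k, replace=False)]
    for _ in range(iters):
        # assign each sample to its nearest center (squared Euclidean distance)
        d2 = ((samples[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        # move each center to the mean of its assigned samples
        for j in range(k):
            members = samples[labels == j]
            if len(members):
                centers[j] = members.mean(0)
    return centers
```

The returned `centers` array plays the role of the texture dictionary library: each row is one category center used as a dictionary base.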
  • the unknown high-resolution local image block X on the image can be represented as a combination of multiple dictionary bases in the texture dictionary library:

    X = Dh(y) · α   (5)

  • wherein Dh(y) is the set of high-resolution dictionary samples having the same LBS and SES as y, and α is the expression coefficient.
  • the coefficient α satisfies sparsity; the low-resolution dictionary samples Dl(y) are used to calculate the sparse expression coefficient α, and the calculated expression coefficient α is then substituted into equation (5) to calculate the corresponding high-resolution
  • local image block X. The acquisition of the optimal α can therefore be transformed into the following optimization problem:

    min ||α||_0  s.t.  ||F·Dl(y)·α − F·y||² ≤ ε   (6)

  • wherein ε is a minimum value tending to 0
  • and F is a feature-extraction operator.
  • the feature is a local gray-level difference combined with a gradient value. Since α is sufficiently sparse, the L1 norm is used in place of the L0 norm of equation (6), and the optimization problem becomes:

    min λ||α||_1 + ||F·Dl(y)·α − F·y||²   (7)

  • wherein λ is a coefficient that adjusts the trade-off between sparsity and similarity
  • the optimal sparse expression coefficient α can be obtained by solving the above Lasso problem and then substituted into equation (5) to calculate the high-resolution local image block X corresponding to y.
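A hedged sketch of solving the Lasso problem of equation (7) and reconstructing the high-resolution block via equation (5). The ISTA (proximal-gradient) solver below is an illustrative choice, not a solver named in the patent, and the feature operator F is assumed to be already folded into the inputs.

```python
import numpy as np

def sparse_code(y_feat, Dl_feat, lam=0.1, iters=500):
    """Solve  min_a  0.5*||Dl_feat @ a - y_feat||^2 + lam*||a||_1
    by ISTA: gradient step on the quadratic term, then soft-thresholding.
    y_feat: feature vector of the low-res patch (F*y);
    Dl_feat: feature matrix of the low-res dictionary samples (F*Dl(y))."""
    L = np.linalg.norm(Dl_feat, 2) ** 2  # Lipschitz constant of the gradient
    a = np.zeros(Dl_feat.shape[1])
    for _ in range(iters):
        grad = Dl_feat.T @ (Dl_feat @ a - y_feat)
        z = a - grad / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return a

def reconstruct_hr(a, Dh):
    """Equation (5): the high-resolution block is Dh(y) @ alpha."""
    return Dh @ a
```

With a well-separated dictionary, the returned coefficient vector is sparse, and multiplying it by the paired high-resolution dictionary yields the estimated high-resolution patch.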
  • Embodiment 2:
  • FIG. 3 is a flowchart of an embodiment of step 101 in the first embodiment.
  • each dictionary base in the texture dictionary is classified according to local features of high-resolution image blocks of each training image and local features of low-resolution image blocks corresponding to the high-resolution image block, Local features include local binary structures and sharp edge structures.
  • the image super-resolution-based video coding method uses the pre-trained texture dictionary library to perform super-resolution interpolation processing on the encoded video image, which may specifically include the following steps:
  • the local features of each image block are paired with the local features of each dictionary base in the dictionary library to obtain the paired dictionary bases.
  • Embodiment 3:
  • FIG. 4 is a flowchart of a video decoding method based on image super resolution in an embodiment.
  • the video decoding method based on image super-resolution provided in this embodiment may include the following steps:
  • the texture dictionary library includes one or more sets of dictionary bases, each dictionary base being a mapping group formed by combining a high-resolution image block of the training image with the low-resolution image block corresponding to that high-resolution image block; the super-resolution interpolation process includes performing image enlargement and recovery of image detail information.
  • the image super-resolution-based video decoding method decodes the acquired encoded image stream signal to obtain a prediction residual block, and uses a pre-trained texture dictionary library to perform super-resolution interpolation processing on the video image to be decoded.
  • the super-resolution interpolation processing includes image enlargement and recovery of image detail information; motion compensation is then performed on the interpolated video image to obtain prediction blocks, and each prediction block is added to the corresponding prediction residual block to obtain the decoded video image.
  • Compared with the prior-art method of predicting the video image by linear interpolation, the method of the present application performs super-resolution interpolation processing on the video image to be decoded before prediction, enlarging the image and recovering its detail information. In this way, when motion compensation is performed on the image to be decoded to obtain prediction blocks, the blurred prediction-block edges of the prior art do not occur, improving prediction accuracy and in turn decoding efficiency.
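The decoder-side reconstruction described above reduces to adding the motion-compensated prediction block to the decoded residual block. A minimal sketch follows; the int16 widening and the clipping to the 8-bit pixel range are implementation assumptions, not details given in the patent.

```python
import numpy as np

def reconstruct_block(pred_block, residual_block):
    """Decoder side: decoded block = prediction block + prediction
    residual block, clipped back to the valid 8-bit pixel range.
    pred_block: uint8 prediction from motion compensation;
    residual_block: signed residual decoded from the bitstream."""
    out = pred_block.astype(np.int16) + residual_block.astype(np.int16)
    return np.clip(out, 0, 255).astype(np.uint8)
```

Applying this to every image block of the frame yields the decoded video image.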
  • Embodiment 4:
  • FIG. 5 is a flowchart of an embodiment of step 202 in the third embodiment.
  • each dictionary base in the texture dictionary is classified according to local features of high-resolution image blocks of each training image and local features of low-resolution image blocks corresponding to the high-resolution image block, Local features include local binary structures and sharp edge structures.
  • Using the pre-trained texture dictionary library to perform super-resolution interpolation processing on the decoded video image may specifically include the following steps:
  • 202b. Pair the local features of each image block with the local features of each dictionary base in the texture dictionary library to obtain the paired dictionary bases.
  • 202c. Perform detail information recovery and image enlargement processing on the video image to be decoded by using the paired dictionary bases.
  • Embodiment 5:
  • the embodiment provides a video encoding device based on image super resolution, which may include:
  • the super-resolution interpolation processing unit 60 is configured to perform super-resolution interpolation processing on the video image to be encoded by using the pre-trained texture dictionary library, where the texture dictionary library includes: one or more sets of dictionary bases, and the dictionary base includes: training a high-resolution image block of the image and a mapping group combined with the low-resolution image block corresponding to the high-resolution image block, the super-resolution interpolation processing comprising: performing image enlargement and image detail information recovery.
  • the prediction unit 61 is configured to perform motion estimation and motion compensation, for each image block of the image to be encoded, on the reference image produced by the super-resolution interpolation processing unit 60, to obtain the prediction block corresponding to each image block of the video image to be encoded.
  • the subtraction calculation unit 62 is configured to subtract each image block of the video image to be encoded and the corresponding prediction block estimated by the prediction unit 61, to obtain a prediction residual block.
  • the coding unit 63 is configured to perform coding processing on the prediction residual block calculated by the subtraction calculation unit 62.
  • each dictionary base in the texture dictionary is classified according to the local features of the high-resolution image block of each training image and the local features of the low-resolution image block corresponding to that high-resolution image block, the local features including a local binary structure and a sharp edge structure.
  • FIG. 7 is a schematic structural diagram of a super-resolution interpolation processing unit according to Embodiment 5 of the present application.
  • the super-resolution interpolation processing unit 60 may specifically include:
  • the extracting module 601 is configured to extract local features of each image block on the video image to be encoded.
  • the pairing module 602 is configured to pair the local features of each image block of the video image to be encoded, extracted by the extraction module 601, with the local features of each dictionary base in the texture dictionary library to obtain the paired dictionary bases.
  • the image processing module 603 is configured to perform image detail information recovery and image enlargement processing on the corresponding image blocks of the video image to be encoded by using the dictionary bases paired by the pairing module 602.
  • the embodiment provides a video decoding device based on image super resolution, which may include:
  • the decoding unit 70 is configured to decode the acquired image encoded stream signal to obtain a prediction residual block.
  • the super-resolution interpolation processing unit 71 is configured to perform super-resolution interpolation processing on the video image to be decoded by using the pre-trained texture dictionary library to obtain a reference image, where the texture dictionary library includes: One or more sets of dictionary bases, the dictionary base comprising: a high resolution image block of the training image and a mapping group combined with the low resolution image blocks corresponding to the high resolution image block, the super resolution interpolation Processing includes: performing image enlargement and image detail recovery.
  • the prediction unit 72 is configured to perform motion compensation, for each image block of the video image to be decoded, on the reference image produced by the super-resolution interpolation processing unit 71, to obtain the prediction block corresponding to each image block.
  • the addition calculation unit 73 is configured to add the prediction block obtained by the prediction unit 72 and the prediction residual block obtained by the decoding unit, to obtain the decoded video image.
  • FIG. 9 is a schematic structural diagram of a super-resolution interpolation processing unit.
  • Each dictionary base in the texture dictionary is classified according to the local features of the high-resolution image block of each training image and the local features of the low-resolution image block corresponding to that high-resolution image block, the local features including a local binary structure and a sharp edge structure.
  • the super-resolution interpolation processing unit 71 includes:
  • the extracting module 710 is configured to extract local features of the video image to be decoded.
  • the pairing module 711 is configured to pair the local features of each image block of the video image to be decoded, extracted by the extraction module 710, with the local features of each dictionary base in the texture dictionary library to obtain the paired dictionary bases.
  • the image processing module 712 is configured to perform detail information recovery and image enlargement processing on the video image to be decoded by using the dictionary bases paired by the pairing module 711.
  • the embodiment provides a video encoding system based on image super resolution, which may include:
  • a data input unit 80 for inputting data
  • a data output unit 81 for outputting data
  • a storage unit 82 for storing data, including an executable program
  • a processor 83, connected to the data input unit 80, the data output unit 81 and the storage unit 82, for performing all or part of the steps of the foregoing method.
  • the embodiment provides a video decoding system based on image super resolution, which may include:
  • a data input unit 90 for inputting data
  • a data output unit 91 for outputting data
  • a storage unit 92 for storing data, including an executable program; and a processor 93, connected to the data input unit 90, the data output unit 91 and the storage unit 92, for performing all or part of the steps of the foregoing method.
  • the above steps may be completed by a program instructing the related hardware; the program may be stored in a computer-readable storage medium, and the storage medium may include a read-only memory, a random access memory, a magnetic disk, an optical disk, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

With the video encoding and decoding method and apparatus based on image super-resolution provided by the present application, super-resolution interpolation processing is performed on the video image to be encoded/decoded before prediction, so that the image to be encoded/decoded can be enlarged and its detail information recovered. Thus, when the image to be encoded/decoded is predicted to obtain a prediction block, the original image is restored more effectively than with the prior-art method of predicting the video image by linear interpolation, avoiding the blurred prediction-block edges of the prior art, thereby improving the accuracy of video image prediction and in turn the coding efficiency of the video image.

Description

Video encoding and decoding method and apparatus based on image super-resolution

Technical Field

The present invention relates to the field of image super-resolution technology, and in particular to a video encoding and decoding method and apparatus based on image super-resolution.

Background Art

Traditional coding methods compress video images by exploiting the information redundancy between the image to be encoded and the video itself. As coding technology has advanced, the redundancy of video coding has been continuously reduced, and the temporal and spatial correlation between the image to be encoded and the video itself has been fully exploited. Predicting the image and video information with the help of information beyond the image and video to be encoded, and thereby reducing the amount of information in the image to be encoded and the video itself, is a new direction for greatly improving image and video compression efficiency.

In the prior art, sub-pixel motion compensation is widely adopted to improve the efficiency of inter-frame prediction of video images. To obtain sub-pixel information, linear interpolation is currently the common approach. Linear interpolation has the advantage of simplicity, but it can hardly recover the high-frequency detail information of a high-resolution image and it blurs edge regions, which limits the efficiency of sub-pixel motion compensation.
Summary of the Invention

The video encoding method based on image super-resolution provided by an embodiment of the present invention includes: performing super-resolution interpolation processing on the video image to be encoded by using a pre-trained texture dictionary library to obtain a reference image, the texture dictionary library including one or more sets of dictionary bases, each dictionary base including a mapping group combining a high-resolution image block of a training image with the low-resolution image block corresponding to that high-resolution image block, and the super-resolution interpolation processing including image enlargement and recovery of image detail information; performing motion estimation and motion compensation on the reference image for each image block of the image to be encoded, to obtain a prediction block corresponding to each image block of the video image to be encoded; subtracting the corresponding prediction block from each image block of the video image to be encoded to obtain a prediction residual block; and encoding the prediction residual block.

The video decoding method based on image super-resolution provided by an embodiment of the present invention includes: decoding the acquired encoded image stream signal to obtain a prediction residual block; performing super-resolution interpolation processing on the video image to be decoded by using a pre-trained texture dictionary library to obtain a reference image, the texture dictionary library including one or more sets of dictionary bases, each dictionary base being a mapping group combining a high-resolution image block of a training image with the low-resolution image block corresponding to that high-resolution image block, and the super-resolution interpolation processing including image enlargement and recovery of image detail information; performing motion compensation on the reference image for each image block of the video image to be decoded, to obtain a prediction block corresponding to each image block; and adding the prediction block to the prediction residual block to obtain the decoded video image.

The video encoding apparatus based on image super-resolution provided by an embodiment of the present invention includes: a super-resolution interpolation processing unit, configured to perform super-resolution interpolation processing on the video image to be encoded by using a pre-trained texture dictionary library to obtain a reference image, the texture dictionary library including one or more sets of dictionary bases, each dictionary base including a mapping group combining a high-resolution image block of a training image with the low-resolution image block corresponding to that high-resolution image block, and the super-resolution interpolation processing including image enlargement and recovery of image detail information; a prediction unit, configured to perform motion estimation and motion compensation on the reference image for each image block of the image to be encoded, to obtain a prediction block corresponding to each image block of the video image to be encoded; a subtraction calculation unit, configured to subtract the corresponding prediction block estimated by the prediction unit from each image block of the video image to be encoded, to obtain a prediction residual block; and a coding unit, configured to encode the prediction residual block calculated by the subtraction calculation unit.

The video decoding apparatus based on image super-resolution provided by an embodiment of the present invention includes: a decoding unit, configured to decode the acquired encoded image stream signal to obtain a prediction residual block; a super-resolution interpolation processing unit, configured to perform super-resolution interpolation processing on the video image to be decoded by using a pre-trained texture dictionary library to obtain a reference image, the texture dictionary library including one or more sets of dictionary bases, each dictionary base including a high-resolution image block of a training image and the low-resolution image block corresponding to that high-resolution image block, and the super-resolution interpolation processing including image enlargement and recovery of image detail information; a prediction unit, configured to perform motion compensation on the reference image for each image block of the video image to be decoded, to obtain a prediction block corresponding to each image block; and an addition calculation unit, configured to add the prediction block obtained by the prediction unit and the prediction residual block obtained by the decoding unit, to obtain the decoded video image.

The video encoding apparatus based on image super-resolution provided by an embodiment of the present invention includes: a data input unit for inputting data; a data output unit for outputting data; a storage unit for storing data, including an executable program; and a processor, connected to the data input unit, the data output unit and the storage unit, for executing the executable program, execution of the program including completing the above method.

The video decoding apparatus based on image super-resolution provided by an embodiment of the present invention includes: a data input unit for inputting data; a data output unit for outputting data; a storage unit for storing data, including an executable program; and a processor, connected to the data input unit, the data output unit and the storage unit, for executing the executable program, execution of the program including completing the above method.
从以上技术方案可以看出, 本发明实施例具有以下优点:
本申请提供的基于图像超分辨率的视频编解码方法及装置, 本申请 方法在对待编码和待编码的视频图像进行预测前,先对待编码 /待解码视 频图像进行超分辨率插值处理,可以对待编码 /待解码图像进行放大及进 行细节信息恢复,从而,在对待编码 /待解码图像进行预测得到预测块时, 相比现有技术利用线性插值对视频图像进行预测的方法, 更能有效还原 原图像, 避免出现现有技术中预测块边缘模糊的问题, 从而提升视频图 像预测的准确性, 进而提升视频图像的编码效率。
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart of the image super-resolution based video encoding method of Embodiment 1;
Figs. 2a-2c are schematic diagrams of feature extraction of the local texture structure of an image block in an embodiment of the present application;
Fig. 3 is a flowchart of an implementation of step 101 in Embodiment 2;
Fig. 4 is a flowchart of the image super-resolution based video decoding method of Embodiment 3;
Fig. 5 is a flowchart of an implementation of step 202 in Embodiment 3;
Fig. 6 is a schematic structural diagram of the device of Embodiment 5;
Fig. 7 is a schematic structural diagram of the super-resolution interpolation unit of Embodiment 5;
Fig. 8 is a schematic structural diagram of the device of Embodiment 6;
Fig. 9 is a schematic structural diagram of the super-resolution interpolation unit of Embodiment 6;
Fig. 10 is a schematic structural diagram of the device of Embodiment 7;
Fig. 11 is a schematic structural diagram of the device of Embodiment 8.
Detailed Description
The embodiments of the present application provide an image super-resolution based video encoding and decoding method and device that can recover the high-frequency information of an image and improve image quality, and can therefore be applied to temporal prediction of video images to improve prediction accuracy and, in turn, coding and decoding efficiency.
The present application is described in further detail below through specific embodiments with reference to the accompanying drawings.
Embodiment 1:
Please refer to Fig. 1, a flowchart of an image super-resolution based video encoding method according to an embodiment. As shown in Fig. 1, this embodiment provides an image super-resolution based video encoding method, which may comprise the following steps:
101. Perform super-resolution interpolation on the video image to be encoded using a pre-trained texture dictionary library.
The super-resolution interpolation yields a reference image. The texture dictionary library comprises one or more dictionary bases, each dictionary basis being a mapping pair composed of a high-resolution image block of a training image and the corresponding low-resolution image block; the super-resolution interpolation comprises image magnification and restoration of image detail.
102. Perform motion estimation and motion compensation on the reference image for each image block of the video image to be encoded, to obtain the prediction block corresponding to each image block.
The image blocks may be partitioned on the video image according to a preset partitioning rule, for example taking 2x2 pixels as one image block; this rule is given only as an example and is not limiting. In this step, motion estimation and motion compensation may be performed in the reference image for each partitioned image block of the video image to be encoded, computing each block's positional offset and corresponding pixel values in the reference frame, thereby obtaining, after motion estimation, the prediction block corresponding to each image block of the video image to be encoded.
103. Subtract the corresponding prediction block from each image block of the video image to be encoded, to obtain a prediction residual block.
104. Encode the prediction residual block.
The image super-resolution based video encoding method of Embodiment 1 performs super-resolution interpolation on the video image to be encoded using a pre-trained texture dictionary library, which magnifies the image and restores its detail; motion estimation is then performed against the super-resolution interpolated reference image to obtain the corresponding prediction blocks, the prediction blocks are subtracted from the video image to obtain residual blocks, and the residual blocks are encoded. Compared with the prior-art method of predicting video images by linear interpolation, performing super-resolution interpolation before prediction magnifies the image to be encoded and restores its detail, so that when prediction blocks are obtained by motion estimation the blurred prediction-block edges of the prior art are avoided, improving prediction accuracy and hence coding efficiency.
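Purely as an illustration, the prediction/residual round trip of steps 102-104 (and the mirror-image decoder steps described later) can be sketched as below. The 1-D blocks, the SAD matching criterion, and the `candidates` mapping from hypothetical motion vectors to patches of the super-resolved reference are simplifying assumptions, not details fixed by this application:

```python
def encode_block(block, candidates):
    """Motion estimation + residual computation against a super-resolved reference.
    `candidates` maps a candidate motion vector to the reference patch it points at."""
    def sad(a, b):  # sum of absolute differences, a common matching criterion
        return sum(abs(x - y) for x, y in zip(a, b))
    mv = min(candidates, key=lambda v: sad(block, candidates[v]))   # motion estimation
    residual = [b - r for b, r in zip(block, candidates[mv])]       # step 103
    return mv, residual            # step 104 would entropy-code the motion vector and residual

def decode_block(mv, residual, candidates):
    """Decoder side: motion compensation plus residual addition."""
    return [r + d for r, d in zip(candidates[mv], residual)]
```

Because the decoder regenerates the same super-resolved reference from the same dictionary, only the motion vector and the residual need to be transmitted.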
In a preferred embodiment, the dictionary bases in the texture dictionary library are classified according to the local features of the high-resolution image blocks of the training images and of the corresponding low-resolution image blocks, the local features comprising the Local Binary Structure (LBS) and the Sharp Edge Structure (SES).
In this embodiment, the texture dictionary is trained in advance; the pre-training may adopt the following implementation:
S1. Select a plurality of high-resolution local image blocks from a training image set containing several training images, each high-resolution local image block consisting of at least two pixels of the image it belongs to. Down-sample the training images to obtain the low-resolution local image block in one-to-one correspondence with each local image block.
S2. Extract the local features of the high-resolution local image blocks to obtain high-resolution dictionary samples Dh(y), and extract the local features of the low-resolution local image blocks in one-to-one correspondence with each local image block to obtain low-resolution dictionary samples Dl(y); map and combine the high-resolution dictionary samples with the low-resolution dictionary samples to obtain a group of dictionary basis samples. The local features comprise LBS and SES.
S3. Train the groups of dictionary basis samples to obtain the texture dictionary library.
The following illustrates the process and principle of performing super-resolution interpolation on a video image using the pre-trained texture dictionary library in Embodiment 1.
As shown in Figs. 2a, 2b and 2c, A, B, C and D are four locally adjacent pixels; in the figures, the height of a pixel reflects the magnitude of its gray value. As shown in Fig. 2a, the four pixels A, B, C and D form a flat local region, so their gray values are equal. As shown in Fig. 2b, the gray values of pixels A and B are higher than those of pixels C and D. This embodiment defines LBS-Geometry (LBS_G) to distinguish such differences in geometric structure; LBS_G is computed as in formula (1):

LBS_G = Σ_{p=1..4} s(g_p − g_mean) · 2^(p−1),  where s(x) = 1 if x ≥ 0 and s(x) = 0 otherwise  (1)

where g_p denotes the gray value of the p-th local pixel, and g_mean is the mean gray value of the local region formed by the four pixels A, B, C and D. This embodiment takes four pixels as an example; in other embodiments the number of pixels may be another value, e.g. N, where N is a positive integer.
For the local image blocks shown in Figs. 2b and 2c, the two still belong to different local patterns because their degrees of gray-value difference differ. This embodiment therefore defines LBS-Difference (LBS_D) to represent the degree of local gray-value difference, giving formula (2):

LBS_D = Σ_{p=1..4} s(d_p − d_global) · 2^(p−1),  d_p = |g_p − g_mean|  (2)

where d_global is the mean of all the local gray-value differences over the whole image.
Combining LBS_G and LBS_D yields the complete local binary structure description, as shown in formula (3):

LBS = Σ_{p=1..4} s(g_p − g_mean) · 2^(p+3) + Σ_{p=1..4} s(d_p − d_global) · 2^(p−1)  (3)

The sharp edge structure is obtained by comparing the local gray-value differences against a preset threshold, as shown in formula (4):

SES = Σ_{p=1..4} s(d_p − t) · 2^(p−1)  (4)

where t is a preset gray-value threshold; in a specific embodiment, t is set to a relatively large value so as to distinguish sharp edges.
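As an illustrative reading of formulas (1)-(3), the LBS code can be packed into an 8-bit integer as below. The thresholded form used here for the SES code of formula (4) is an assumption, since that formula is only partially legible in the source:

```python
def local_descriptors(g, d_global, t):
    """Compute the LBS code of formulas (1)-(3) and an (assumed) SES code
    for a 4-pixel local block g = [gA, gB, gC, gD]."""
    s = lambda x: 1 if x >= 0 else 0                # the binary step function s(x)
    g_mean = sum(g) / len(g)
    d = [abs(gp - g_mean) for gp in g]              # local gray-value differences d_p
    lbs_g = sum(s(g[p] - g_mean) << p for p in range(4))    # formula (1), bits 0..3
    lbs_d = sum(s(d[p] - d_global) << p for p in range(4))  # formula (2), bits 0..3
    lbs = (lbs_g << 4) | lbs_d                      # formula (3): geometry bits at 2^(p+3)
    ses = sum(s(d[p] - t) << p for p in range(4))   # assumed form of formula (4)
    return lbs, ses
```

For a step edge such as g = [10, 10, 0, 0] the geometry bits encode which pixels lie above the mean, while a large threshold t keeps SES zero unless the edge is sharp.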
In this embodiment, the texture dictionary may be trained by K-means clustering, yielding an under-complete dictionary, or by sparse coding, yielding an over-complete dictionary. When the dictionary is trained by K-means clustering, a certain number of samples (for example one hundred thousand) are selected from the feature samples, the K-means clustering algorithm is used to cluster them into a number of class centers, and the set of these class centers is used as the texture dictionary library. Training the dictionary by K-means clustering makes it possible to build a low-dimensional, under-complete dictionary library.
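A minimal sketch of the K-means training step described above; the deterministic initialization from the first k samples, the fixed iteration count, and the 1-D toy samples are illustrative assumptions, not details prescribed here:

```python
def kmeans(samples, k, iters=10):
    """Cluster feature samples into k class centers; the centers form the dictionary."""
    centers = [list(s) for s in samples[:k]]        # simple deterministic initialization
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for s in samples:                           # assign each sample to its nearest center
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(s, centers[c])))
            groups[j].append(s)
        # recompute each center as the mean of its group (keep old center if group is empty)
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers
```

In practice each sample would be the feature vector of a training patch, and each resulting center would serve as one dictionary atom.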
In a preferred embodiment, when super-resolution interpolation is performed on the video image to be decoded, an unknown high-resolution local image block x of the image can be expressed as a combination of several dictionary bases of the texture dictionary library:

x ≈ Dh(y) · α  (5)

where y is the low-resolution local image block corresponding to the high-resolution local image block x, Dh(y) is the high-resolution dictionary sample of the dictionary bases having the same LBS and SES as y, and α is the representation coefficient vector.
When an over-complete dictionary is used, the coefficients α are sparse. The low-resolution dictionary samples Dl(y) are used to compute the sparse representation coefficients α, which are then substituted into formula (5) to compute the corresponding high-resolution local image block x. Obtaining the optimal α can therefore be cast as the following optimization problem:

min ||α||_0  s.t.  ||F·Dl(y)·α − F·y||² ≤ ε  (6)

where ε is an infinitesimal value tending to 0 and F is the operator that extracts the feature descriptor; in the dictionary D provided by this embodiment, the extracted feature is the local gray-value difference combined with the gradient magnitude. Since α is sufficiently sparse, the L1 norm is used in place of the L0 norm of formula (6), and the optimization problem becomes:

min λ||α||_1 + (1/2)||F·Dl(y)·α − F·y||²  (7)

where λ is a coefficient balancing sparsity and similarity. The optimal sparse representation coefficients α can be obtained by solving the above Lasso problem, and then substituted into formula (5) to compute the high-resolution local image block x corresponding to y.
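One standard way to solve the Lasso problem of formula (7) is iterative soft-thresholding (ISTA), sketched below with the feature operator F taken as the identity. The step size, iteration count, and toy two-atom dictionary are illustrative assumptions; this is not the solver prescribed by this application:

```python
def soft(x, thr):
    """Soft-thresholding: the proximal operator of the L1 norm."""
    return max(abs(x) - thr, 0.0) * (1.0 if x >= 0 else -1.0)

def ista(atoms, y, lam, step=0.1, iters=500):
    """Minimise lam*||a||_1 + 0.5*||D a - y||^2, where the columns of D are `atoms`."""
    n, m = len(atoms), len(y)
    a = [0.0] * n
    for _ in range(iters):
        # residual r = D a - y
        r = [sum(atoms[j][i] * a[j] for j in range(n)) - y[i] for i in range(m)]
        # gradient step on the quadratic term, then shrink toward zero
        a = [soft(a[j] - step * sum(atoms[j][i] * r[i] for i in range(m)), step * lam)
             for j in range(n)]
    return a
```

Substituting the recovered coefficients into x ≈ Dh(y)·α then yields the high-resolution patch.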
When an under-complete dictionary is used, α is not sufficiently sparse; the k-nearest-neighbor algorithm is instead used to find the k dictionary bases Dl(y) closest to y, and x is reconstructed as a linear combination of the k high-resolution dictionary samples Dh(y) corresponding to those Dl(y).
Once each distorted low-resolution local block y of the image has had its sharp high-resolution image block x reconstructed, the final sharp restored image is obtained.
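The under-complete (k-nearest-neighbor) reconstruction just described can be sketched as follows; averaging the k high-resolution atoms with uniform weights is an assumed, simplest choice of linear combination (a distance-weighted combination would also fit the description):

```python
def knn_reconstruct(y, dict_lo, dict_hi, k):
    """Reconstruct the high-resolution patch for low-resolution patch y as the
    average of the k high-resolution atoms whose low-resolution atoms are
    nearest to y (uniform weights as a simplifying assumption)."""
    # squared Euclidean distance from y to each low-resolution atom
    dist = sorted((sum((a - b) ** 2 for a, b in zip(y, atom)), i)
                  for i, atom in enumerate(dict_lo))
    idx = [i for _, i in dist[:k]]                  # indices of the k nearest atoms
    m = len(dict_hi[0])
    return [sum(dict_hi[i][j] for i in idx) / k for j in range(m)]
```

Applying this patch by patch over the whole image produces the restored high-resolution reference image.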
Embodiment 2:
Please refer to Fig. 3, a flowchart of an implementation of step 101 in Embodiment 1. In this embodiment, the dictionary bases in the texture dictionary are classified according to the local features of the high-resolution image blocks of the training images and of the corresponding low-resolution image blocks, the local features comprising the local binary structure and the sharp edge structure.
In the image super-resolution based video encoding method of this embodiment, performing super-resolution interpolation on the video image to be encoded using the pre-trained texture dictionary library may specifically comprise the following steps:
101a. Extract the local features of each image block of the video image to be encoded.
101b. Match the local features of each image block of the video image to be encoded against the local features of the dictionary bases in the texture dictionary library, to obtain the matched dictionary bases.
101c. Use the matched dictionary bases to restore image detail and magnify the corresponding image blocks of the video image to be encoded.
Embodiment 3:
Please refer to Fig. 4, a flowchart of an image super-resolution based video decoding method according to an embodiment. As shown in Fig. 4, the image super-resolution based video decoding method of this embodiment may comprise the following steps:
201. Decode the obtained encoded image stream signal to obtain prediction residual blocks.
202. Perform super-resolution interpolation on the video image to be decoded using a pre-trained texture dictionary library to obtain a reference image, the texture dictionary library comprising one or more dictionary bases, each dictionary basis being a mapping pair composed of a high-resolution image block of a training image and the corresponding low-resolution image block, and the super-resolution interpolation comprising image magnification and restoration of image detail.
203. Perform motion compensation on the reference image for each image block of the video image to be decoded, to obtain prediction blocks.
204. Add the prediction blocks to the prediction residual blocks to obtain the decoded video image.
The image super-resolution based video decoding method of Embodiment 3 decodes the obtained encoded image stream signal to obtain prediction residual blocks, performs super-resolution interpolation on the video image to be decoded using a pre-trained texture dictionary library (the interpolation comprising image magnification and detail restoration), performs motion compensation against the interpolated image to obtain prediction blocks, and adds the prediction blocks to the prediction residual blocks to obtain the decoded video image. Compared with the prior-art method of predicting video images by linear interpolation, performing super-resolution interpolation before prediction magnifies the image to be decoded and restores its detail, so that when prediction blocks are obtained by motion compensation the blurred prediction-block edges of the prior art do not occur, improving prediction accuracy and hence decoding efficiency.
Embodiment 4:
Please refer to Fig. 5, a flowchart of an implementation of step 202 in Embodiment 3. In this embodiment, the dictionary bases in the texture dictionary are classified according to the local features of the high-resolution image blocks of the training images and of the corresponding low-resolution image blocks, the local features comprising the local binary structure and the sharp edge structure.
Performing super-resolution interpolation on the video image to be decoded using the pre-trained texture dictionary library may specifically comprise the following steps:
202a. Extract the local features of each image block of the video image to be decoded.
202b. Match the local features of each image block against the local features of the dictionary bases in the texture dictionary library, to obtain the matched dictionary bases.
202c. Use the matched dictionary bases to restore detail and magnify the video image to be decoded.
Embodiment 5:
Please refer to Fig. 6. This embodiment correspondingly provides an image super-resolution based video encoding device, which may comprise:
a super-resolution interpolation unit 60, configured to perform super-resolution interpolation on the video image to be encoded using a pre-trained texture dictionary library, the texture dictionary library comprising one or more dictionary bases, each dictionary basis being a mapping pair composed of a high-resolution image block of a training image and the corresponding low-resolution image block, and the super-resolution interpolation comprising image magnification and restoration of image detail;
a prediction unit 61, configured to perform motion estimation and motion compensation, for each image block of the image to be encoded, on the reference image produced by the super-resolution interpolation unit 60, to obtain the prediction block corresponding to each image block of the video image to be encoded;
a subtraction unit 62, configured to subtract the corresponding prediction block estimated by the prediction unit 61 from each image block of the video image to be encoded, to obtain prediction residual blocks;
an encoding unit 63, configured to encode the prediction residual blocks computed by the subtraction unit 62.
In a preferred embodiment, referring to Fig. 6, the dictionary bases in the texture dictionary are classified according to the local features of the high-resolution image blocks of the training images and of the corresponding low-resolution image blocks, the local features comprising the local binary structure and the sharp edge structure.
Please refer to Fig. 7, a schematic structural diagram of the super-resolution interpolation unit of Embodiment 5. As shown in Fig. 7, the super-resolution interpolation unit 60 may specifically comprise:
an extraction module 601, configured to extract the local features of each image block of the video image to be encoded;
a matching module 602, configured to match the local features of each image block of the video image to be encoded, extracted by the extraction module 601, against the local features of the dictionary bases in the texture dictionary library, to obtain the matched dictionary bases;
an image processing module 603, configured to use the dictionary bases matched by the matching module 602 to restore image detail and magnify the corresponding image blocks of the video image to be encoded.
Embodiment 6:
Please refer to Fig. 8. This embodiment provides an image super-resolution based video decoding device, which may comprise:
a decoding unit 70, configured to decode the obtained encoded image stream signal to obtain prediction residual blocks;
a super-resolution interpolation unit 71, configured to perform super-resolution interpolation on the video image to be decoded using a pre-trained texture dictionary library to obtain a reference image, the texture dictionary library comprising one or more dictionary bases, each dictionary basis being a mapping pair composed of a high-resolution image block of a training image and the corresponding low-resolution image block, and the super-resolution interpolation comprising image magnification and restoration of image detail;
a prediction unit 72, configured to perform motion compensation, for each image block of the video image to be decoded, on the reference image interpolated by the super-resolution interpolation unit 71, to obtain the prediction block corresponding to each image block;
an addition unit 73, configured to add the prediction blocks obtained by the prediction unit 72 to the prediction residual blocks obtained by the decoding unit, to obtain the decoded video image.
In a preferred embodiment, please refer to Fig. 9, a schematic structural diagram of the super-resolution interpolation unit. The dictionary bases in the texture dictionary are classified according to the local features of the high-resolution image blocks of the training images and of the corresponding low-resolution image blocks, the local features comprising the local binary structure and the sharp edge structure.
The super-resolution interpolation unit 71 comprises:
an extraction module 710, configured to extract the local features of the video image to be decoded;
a matching module 711, configured to match the local features of each image block of the video image to be decoded, extracted by the extraction module 710, against the local features of the dictionary bases in the texture dictionary library, to obtain the matched dictionary bases;
an image processing module 712, configured to use the dictionary bases matched by the matching module 711 to restore detail and magnify the video image to be decoded.
Embodiment 7:
Please refer to Fig. 10. This embodiment provides an image super-resolution based video encoding system, which may comprise:
a data input unit 80 for inputting data; a data output unit 81 for outputting data; a storage unit 82 for storing data, including an executable program; and a processor 83, connected to the data input unit 80, the data output unit 81 and the storage unit 82, configured to execute the executable program, the execution of the program comprising performing all or part of the steps of the encoding method described above.
Embodiment 8:
Please refer to Fig. 11. This embodiment provides an image super-resolution based video decoding system, which may comprise:
a data input unit 90 for inputting data; a data output unit 91 for outputting data; a storage unit 92 for storing data, including an executable program; and a processor 93, connected to the data input unit 90, the data output unit 91 and the storage unit 92, configured to execute the executable program, the execution of the program comprising performing all or part of the steps of the decoding method described above.
Those skilled in the art will understand that all or part of the steps of the various methods in the above embodiments may be completed by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium may include a read-only memory, a random access memory, a magnetic disk, an optical disk, and the like.
The above are merely preferred embodiments of the present invention. It should be understood that these embodiments serve only to explain the present invention and are not intended to limit it. Those of ordinary skill in the art may vary the above specific implementations in accordance with the idea of the present invention.

Claims

1. An image super-resolution based video encoding method, characterized by comprising:
performing super-resolution interpolation on a video image to be encoded using a pre-trained texture dictionary library to obtain a reference image, the texture dictionary library comprising one or more dictionary bases, each dictionary basis being a mapping pair composed of a high-resolution image block of a training image and the corresponding low-resolution image block, and the super-resolution interpolation comprising image magnification and restoration of image detail;
performing motion estimation and motion compensation on the reference image for each image block of the video image to be encoded, to obtain a prediction block corresponding to each image block of the video image to be encoded;
subtracting the corresponding prediction block from each image block of the video image to be encoded, to obtain a prediction residual block; and
encoding the prediction residual block.
2. The image super-resolution based video encoding method according to claim 1, characterized in that the dictionary bases in the texture dictionary library are classified according to the local features of the high-resolution image blocks of the training images and of the corresponding low-resolution image blocks, the local features comprising a local binary structure and a sharp edge structure.
3. The image super-resolution based video encoding method according to claim 2, characterized in that performing super-resolution interpolation on the video image to be encoded using the pre-trained texture dictionary library comprises:
extracting the local features of each image block of the video image to be encoded;
matching the local features of each image block of the video image to be encoded against the local features of the dictionary bases in the texture dictionary library, to obtain matched dictionary bases; and
using the matched dictionary bases to restore image detail and magnify the corresponding image blocks of the video image to be encoded.
4. An image super-resolution based video decoding method, characterized by comprising:
decoding an obtained encoded image stream signal to obtain a prediction residual block;
performing super-resolution interpolation on a video image to be decoded using a pre-trained texture dictionary library to obtain a reference image, the texture dictionary library comprising one or more dictionary bases, each dictionary basis being a mapping pair composed of a high-resolution image block of a training image and the corresponding low-resolution image block, and the super-resolution interpolation comprising image magnification and restoration of image detail;
performing motion compensation on the reference image for each image block of the video image to be decoded, to obtain a prediction block corresponding to each image block; and
adding the prediction block to the prediction residual block to obtain the decoded video image.
5. The image super-resolution based video decoding method according to claim 4, characterized in that the dictionary bases in the texture dictionary library are classified according to the local features of the high-resolution image blocks of the training images and of the corresponding low-resolution image blocks, the local features comprising a local binary structure and a sharp edge structure.
6. The image super-resolution based video decoding method according to claim 5, characterized in that performing super-resolution interpolation on the video image to be decoded using the pre-trained texture dictionary library comprises:
extracting the local features of each image block of the video image to be decoded;
matching the local features of each image block of the video image to be decoded against the local features of the dictionary bases in the texture dictionary library, to obtain matched dictionary bases; and
using the matched dictionary bases to restore detail and magnify the video image to be decoded.
7. An image super-resolution based video encoding device, characterized by comprising:
a super-resolution interpolation unit, configured to perform super-resolution interpolation on a video image to be encoded using a pre-trained texture dictionary library to obtain a reference image, the texture dictionary library comprising one or more dictionary bases, each dictionary basis being a mapping pair composed of a high-resolution image block of a training image and the corresponding low-resolution image block, and the super-resolution interpolation comprising image magnification and restoration of image detail;
a prediction unit, configured to perform motion estimation and motion compensation on the reference image for each image block of the image to be encoded, to obtain a prediction block corresponding to each image block of the video image to be encoded;
a subtraction unit, configured to subtract the corresponding prediction block estimated by the prediction unit from each image block of the video image to be encoded, to obtain a prediction residual block; and
an encoding unit, configured to encode the prediction residual block computed by the subtraction unit.
8. The image super-resolution based video encoding device according to claim 7, characterized in that the dictionary bases in the texture dictionary library are classified according to the local features of the high-resolution image blocks of the training images and of the corresponding low-resolution image blocks, the local features comprising a local binary structure and a sharp edge structure;
the super-resolution interpolation unit specifically comprises:
an extraction module, configured to extract the local features of each image block of the video image to be encoded;
a matching module, configured to match the local features of each image block of the video image to be encoded, extracted by the extraction module, against the local features of the dictionary bases in the texture dictionary library, to obtain matched dictionary bases; and
an image processing module, configured to use the dictionary bases matched by the matching module to restore image detail and magnify the corresponding image blocks of the video image to be encoded.
9. An image super-resolution based video decoding device, characterized by comprising:
a decoding unit, configured to decode an obtained encoded image stream signal to obtain a prediction residual block;
a super-resolution interpolation unit, configured to perform super-resolution interpolation on a video image to be decoded using a pre-trained texture dictionary library to obtain a reference image, the texture dictionary library comprising one or more dictionary bases, each dictionary basis being a mapping pair composed of a high-resolution image block of a training image and the corresponding low-resolution image block, and the super-resolution interpolation comprising image magnification and restoration of image detail;
a prediction unit, configured to perform motion compensation on the reference image for each image block of the video image to be decoded, to obtain a prediction block corresponding to each image block; and
an addition unit, configured to add the prediction block obtained by the prediction unit to the prediction residual block obtained by the decoding unit, to obtain the decoded video image.
10. The image super-resolution based video decoding device according to claim 9, characterized in that the dictionary bases in the texture dictionary library are classified according to the local features of the high-resolution image blocks of the training images and of the corresponding low-resolution image blocks, the local features comprising a local binary structure and a sharp edge structure;
the super-resolution interpolation unit comprises:
an extraction module, configured to extract the local features of the video image to be decoded;
a matching module, configured to match the local features of each image block of the video image to be decoded, extracted by the extraction module, against the local features of the dictionary bases in the texture dictionary library, to obtain matched dictionary bases; and
an image processing module, configured to use the dictionary bases matched by the matching module to restore detail and magnify the video image to be decoded.
11. An image super-resolution based video encoding system, characterized by comprising: a data input unit for inputting data; a data output unit for outputting data; a storage unit for storing data, including an executable program; and a processor, connected to the data input unit, the data output unit and the storage unit, configured to execute the executable program, the execution of the program comprising performing the method of any one of claims 1-3.
12. An image super-resolution based video decoding system, characterized by comprising: a data input unit for inputting data; a data output unit for outputting data; a storage unit for storing data, including an executable program; and a processor, connected to the data input unit, the data output unit and the storage unit, configured to execute the executable program, the execution of the program comprising performing the method of any one of claims 4-6.
PCT/CN2014/078613 2014-05-28 2014-05-28 一种基于图像超分辨率的视频编解码方法及装置 WO2015180054A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2014/078613 WO2015180054A1 (zh) 2014-05-28 2014-05-28 一种基于图像超分辨率的视频编解码方法及装置
US15/060,627 US9986255B2 (en) 2014-05-28 2016-03-04 Method and device for video encoding or decoding based on image super-resolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/078613 WO2015180054A1 (zh) 2014-05-28 2014-05-28 一种基于图像超分辨率的视频编解码方法及装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/060,627 Continuation-In-Part US9986255B2 (en) 2014-05-28 2016-03-04 Method and device for video encoding or decoding based on image super-resolution

Publications (1)

Publication Number Publication Date
WO2015180054A1 true WO2015180054A1 (zh) 2015-12-03

Family

ID=54697837

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/078613 WO2015180054A1 (zh) 2014-05-28 2014-05-28 一种基于图像超分辨率的视频编解码方法及装置

Country Status (2)

Country Link
US (1) US9986255B2 (zh)
WO (1) WO2015180054A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106408550A (zh) * 2016-09-22 2017-02-15 天津工业大学 Image super-resolution reconstruction method based on improved adaptive multi-dictionary learning
CN113160044A (zh) * 2020-01-23 2021-07-23 百度在线网络技术(北京)有限公司 Depth image super-resolution method, training method, apparatus, device and medium
CN113487481A (zh) * 2021-07-02 2021-10-08 河北工业大学 Recurrent video super-resolution method based on information construction and multiple dense residual blocks

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8724928B2 (en) * 2009-08-31 2014-05-13 Intellectual Ventures Fund 83 Llc Using captured high and low resolution images
CN106529547B (zh) * 2016-10-14 2019-05-03 天津师范大学 Texture recognition method based on complete local features
CN108876721A (zh) * 2018-05-31 2018-11-23 东南大学 Super-resolution image reconstruction method and system based on curriculum learning
US10885608B2 (en) * 2018-06-06 2021-01-05 Adobe Inc. Super-resolution with reference images
CN110097503B (zh) * 2019-04-12 2024-01-19 浙江师范大学 Super-resolution method based on neighborhood regression
CN112449140B (zh) 2019-08-29 2021-09-14 华为技术有限公司 Video super-resolution processing method and apparatus
JP2021061501A (ja) * 2019-10-04 2021-04-15 シャープ株式会社 Moving image conversion device and method
KR20210128091A (ko) 2020-04-16 2021-10-26 삼성전자주식회사 Streaming system and method for providing interactive streaming service
CN114240748A (zh) * 2021-12-06 2022-03-25 中央广播电视总台 Super-resolution method and system based on local autoregressive model and discrete dictionary

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100158104A1 (en) * 2008-12-23 2010-06-24 National Tsing Hua University (Taiwan) Compression method for display frames of qfhd (quad full high definition) resolution and system thereof
CN102156875A (zh) * 2011-03-25 2011-08-17 西安电子科技大学 Image super-resolution reconstruction method based on multi-task KSVD dictionary learning
CN102722865A (zh) * 2012-05-22 2012-10-10 北京工业大学 Super-resolution sparse reconstruction method
CN103049885A (zh) * 2012-12-08 2013-04-17 新疆公众信息产业股份有限公司 Super-resolution image reconstruction method using analytical sparse representation
US20130129207A1 (en) * 2011-11-18 2013-05-23 Dehong Liu Method for Pan-Sharpening Panchromatic and Multispectral Images Using Dictionaries

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4139853B2 (ja) * 2006-08-31 2008-08-27 松下電器産業株式会社 Image processing apparatus, image processing method, and image processing program
US20100086048A1 (en) * 2008-10-03 2010-04-08 Faisal Ishtiaq System and Method for Video Image Processing

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100158104A1 (en) * 2008-12-23 2010-06-24 National Tsing Hua University (Taiwan) Compression method for display frames of qfhd (quad full high definition) resolution and system thereof
CN102156875A (zh) * 2011-03-25 2011-08-17 西安电子科技大学 Image super-resolution reconstruction method based on multi-task KSVD dictionary learning
US20130129207A1 (en) * 2011-11-18 2013-05-23 Dehong Liu Method for Pan-Sharpening Panchromatic and Multispectral Images Using Dictionaries
CN102722865A (zh) * 2012-05-22 2012-10-10 北京工业大学 Super-resolution sparse reconstruction method
CN103049885A (zh) * 2012-12-08 2013-04-17 新疆公众信息产业股份有限公司 Super-resolution image reconstruction method using analytical sparse representation

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106408550A (zh) * 2016-09-22 2017-02-15 天津工业大学 一种改进的自适应多字典学习的图像超分辨率重建方法
CN113160044A (zh) * 2020-01-23 2021-07-23 百度在线网络技术(北京)有限公司 深度图像超分辨率方法、训练方法及装置、设备、介质
CN113160044B (zh) * 2020-01-23 2023-12-26 百度在线网络技术(北京)有限公司 深度图像超分辨率方法、训练方法及装置、设备、介质
CN113487481A (zh) * 2021-07-02 2021-10-08 河北工业大学 基于信息构建和多密集残差块的循环视频超分辨率方法

Also Published As

Publication number Publication date
US20160191940A1 (en) 2016-06-30
US9986255B2 (en) 2018-05-29

Similar Documents

Publication Publication Date Title
WO2015180054A1 (zh) Image super-resolution based video encoding and decoding method and device
US10685282B2 (en) Machine-learning based video compression
Guo et al. Content-based image retrieval using error diffusion block truncation coding features
Li et al. Face hallucination based on sparse local-pixel structure
Qin et al. Efficient reversible data hiding for VQ-compressed images based on index mapping mechanism
Liu et al. Mutual information regularized identity-aware facial expression recognition in compressed video
Zhang et al. Attention-guided image compression by deep reconstruction of compressive sensed saliency skeleton
WO2015180055A1 (zh) Super-resolution image reconstruction method and device based on a classified dictionary library
CN106503112B (zh) Video retrieval method and device
WO2015180052A1 (zh) Dictionary-library-based video encoding and decoding method and device
Dong et al. A survey on compression domain image and video data processing and analysis techniques
CN104244006A (zh) Image super-resolution based video encoding and decoding method and device
AU2023286652A1 (en) Model training method, video encoding method, and video decoding method
Liu et al. Hiding multiple images into a single image via joint compressive autoencoders
CN108171325B (zh) Temporal integration network for multi-scale face restoration, encoding device and decoding device
CN104063855A (zh) Super-resolution image reconstruction method and device based on a classified dictionary library
CN112714313A (zh) Image processing method, apparatus, device and storage medium
CN104053012B (zh) Dictionary-library-based video encoding and decoding method and device
Li et al. HCISNet: Higher‐capacity invisible image steganographic network
Liu et al. Joint compressive autoencoders for full-image-to-image hiding
Guo et al. Toward scalable image feature compression: a content-adaptive and diffusion-based approach
Rahmani et al. A novel legitimacy preserving data hiding scheme based on LAS compressed code of VQ index tables
Barzen et al. Accelerated deep lossless image coding with unified paralleleized GPU coding architecture
Lv et al. A novel auxiliary data construction scheme for reversible data hiding in JPEG images
Zhao et al. Image and Graphics: 10th International Conference, ICIG 2019, Beijing, China, August 23–25, 2019, Proceedings, Part III

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14893471

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14893471

Country of ref document: EP

Kind code of ref document: A1