CN114827630B - Method, system, device and medium for learning CU depth division based on frequency domain distribution - Google Patents

Info

Publication number
CN114827630B
Authority
CN
China
Prior art keywords
frequency domain
blocks
block
division
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210241583.1A
Other languages
Chinese (zh)
Other versions
CN114827630A (en)
Inventor
许皓淇
曹英烈
周智恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Guangzhou City University of Technology
Original Assignee
South China University of Technology SCUT
Guangzhou City University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT, Guangzhou City University of Technology
Priority to CN202210241583.1A
Publication of CN114827630A
Application granted
Publication of CN114827630B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/625 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using discrete cosine transform [DCT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H04N19/126 Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/96 Tree coding, e.g. quad-tree coding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method, system, device and medium for learning CU depth division based on frequency domain distribution. The method includes: dividing an image into several 64x64 blocks and applying the DCT to each block to obtain a frequency domain coefficient distribution matrix F64; computing a probability score p64 from F64 and the corresponding weight matrix W64; if p64 is smaller than the division threshold α64, ending the downward division, and if p64 is greater than α64, continuing to divide the block into four 32x32 sub-CU blocks according to the quadtree principle; in the same way, obtaining the 32x32 frequency domain coefficient matrix F32, computing the probability score p32 with W32, and comparing p32 with the division threshold α32 to decide whether to continue dividing; and so on, until every CU block has either stopped dividing early or been divided into the smallest 8x8 CU blocks. Because the decision to continue dividing is made from the probability score and the division threshold, the invention does not need to recursively traverse all cases, which reduces the complexity of CU depth division and saves a great deal of encoding time. It can be widely used in the field of video coding technology.

Figure 202210241583

Description

Method, system, device and medium for learning CU depth division based on frequency domain distribution

Technical Field

The present invention relates to the fields of artificial intelligence and video coding, and in particular to a method, system, device and medium for learning CU depth division based on frequency domain distribution.

Background Art

In recent years, with the development of the Internet and communication technology, the rapid growth of video traffic has posed great challenges to video coding technology.

In a traditional coding framework (taking HEVC as an example), each frame to be coded is usually divided into a sequence of CTUs (Coding Tree Units) before the subsequent prediction, transform and quantization operations. A CTU can be divided recursively, according to the quadtree principle, into CUs (Coding Units) of different sizes, with a maximum size of 64x64 and a minimum of 8x8. The quality of the CTU division determines the efficiency of the subsequent coding.

To find the best CTU division, the encoder uses recursive traversal: it divides from the 64x64 CU all the way down to 8x8 CUs, evaluates every candidate division with the built-in rate-distortion cost function, and selects the best one.

This exhaustive search wastes a great deal of encoding time and computing resources, and the waste grows significantly as video resolution increases. How to reduce the complexity of CU depth division has therefore become a much-discussed problem in the industry.
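
To give a rough sense of the size of that exhaustive search, the sketch below counts how many CU candidates a full quadtree traversal visits inside one 64x64 CTU and how many CTUs a frame contains. The function names and the example resolutions are illustrative assumptions; real encoders also test several prediction modes per CU, which this count ignores.

```python
# CU sizes HEVC allows inside one 64x64 CTU, from depth 0 to depth 3.
CU_SIZES = [64, 32, 16, 8]

def cus_per_depth(ctu_size=64):
    """Return {cu_size: number of CUs of that size inside one CTU}."""
    return {size: (ctu_size // size) ** 2 for size in CU_SIZES}

def exhaustive_evaluations(ctu_size=64):
    """Total CU candidates an exhaustive quadtree search visits per CTU."""
    return sum(cus_per_depth(ctu_size).values())

def ctus_in_frame(width, height, ctu_size=64):
    """CTUs needed to cover a frame (partial CTUs count as whole ones)."""
    cols = -(-width // ctu_size)   # ceiling division
    rows = -(-height // ctu_size)
    return cols * rows

if __name__ == "__main__":
    per_ctu = exhaustive_evaluations()
    for w, h in [(1920, 1080), (3840, 2160)]:
        print(w, h, ctus_in_frame(w, h) * per_ctu)
```

Under these assumptions each CTU contributes 1 + 4 + 16 + 64 = 85 CU candidates, so the total grows in proportion to the number of CTUs in the frame.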

Summary of the Invention

To solve, at least to some extent, one of the technical problems in the prior art, the purpose of the present invention is to provide a method, system, device and medium for learning CU depth division based on frequency domain distribution.

The technical solution adopted by the present invention is:

A method for learning CU depth division based on frequency domain distribution, comprising the following steps:

obtaining a video image, dividing the video image into several first CU blocks of size 64x64, obtaining the frequency domain coefficient distribution matrix F64 of the DCT of each first CU block, and obtaining a probability score p64 from F64;

if p64 > α64, dividing the first CU block into four second CU blocks of size 32x32, obtaining the frequency domain coefficient distribution matrix F32 of the DCT of each second CU block, and obtaining a probability score p32 from F32; otherwise, ending the division of the first CU block;

if p32 > α32, dividing the second CU block into four third CU blocks of size 16x16, obtaining the frequency domain coefficient distribution matrix F16 of the DCT of each third CU block, and obtaining a probability score p16 from F16; otherwise, ending the division of the second CU block;

if p16 > α16, dividing the third CU block into four fourth CU blocks of size 8x8; otherwise, ending the division of the third CU block;

where αN is the division threshold, N = 64, 32, 16.
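
The steps above amount to a quadtree recursion that stops as soon as a block's score fails to exceed its threshold. The following is a minimal sketch of that control flow only. The `score` callable is a hypothetical stand-in for the probability score pN (the patent's actual formula is given as an image and is not reproduced here), and blocks are plain lists of rows.

```python
def split4(block):
    """Split a square block (list of rows) into its four quadrants."""
    h = len(block) // 2
    top, bottom = block[:h], block[h:]
    return [
        [row[:h] for row in top], [row[h:] for row in top],
        [row[:h] for row in bottom], [row[h:] for row in bottom],
    ]

def partition(block, score, thresholds, x=0, y=0, min_size=8):
    """Return the leaf CUs of one block as (x, y, size) tuples.

    score(block) -> float is a hypothetical stand-in for p_N;
    thresholds maps N (64, 32, 16) to alpha_N.
    """
    size = len(block)
    # Stop at the smallest CU, or early when the score does not exceed alpha_N.
    if size <= min_size or score(block) <= thresholds[size]:
        return [(x, y, size)]
    h = size // 2
    offsets = [(0, 0), (h, 0), (0, h), (h, h)]  # same order as split4
    leaves = []
    for sub, (dx, dy) in zip(split4(block), offsets):
        leaves += partition(sub, score, thresholds, x + dx, y + dy, min_size)
    return leaves
```

A flat block returns a single 64x64 leaf after one score evaluation, whereas an exhaustive search would still descend to every 8x8 block.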

Further, the probability score p64 is calculated by the following formula:

Figure GDA0004182369950000023

where W64 is a preset frequency domain distribution weight matrix, i denotes the row coordinate of the matrix, and j denotes the column coordinate of the matrix.
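
Since the formula itself survives only as an image, the sketch below shows one plausible form rather than the patent's definition: it assumes the score is a logistic function of the element-wise weighted sum of F over i and j. The function name, the `bias` parameter and the choice of the logistic function are all assumptions made for illustration.

```python
import math

def probability_score(F, W, bias=0.0):
    """Hypothetical p_N: logistic function of sum_ij W[i][j] * F[i][j].

    ASSUMPTION: the patent's real formula is not available in this text.
    """
    z = bias + sum(w * f
                   for w_row, f_row in zip(W, F)
                   for w, f in zip(w_row, f_row))
    # Numerically stable logistic function (avoids overflow for large |z|).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

Whatever the exact form, the score maps one coefficient matrix and one weight matrix of the same size to a single number that can be compared with the division threshold.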

Further, the frequency domain distribution weight matrix W64 is obtained as follows:

obtaining the training set required by the network, in which each sample pairs the DCT frequency domain coefficient distribution matrix of the k-th 64x64 CU block with a label Lk = 0 or 1 indicating whether that CU block continues to be divided (0 means no, 1 means yes);

training the network according to a preset loss function to obtain the frequency domain distribution weight matrix W64;

the expression of the preset loss function is:

Figure GDA0004182369950000026
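
The loss function is available only as an image, so the following sketch substitutes a common choice, binary cross-entropy on a logistic score, trained by plain gradient descent. This is an assumption for illustration, not the patent's loss; `train_weights`, `lr` and `epochs` are hypothetical names.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))

def train_weights(samples, n, lr=0.1, epochs=200):
    """Fit an n x n weight matrix W by gradient descent.

    samples: list of (F, L) pairs, F an n x n coefficient matrix and
    L in {0, 1} the split label. ASSUMPTION: binary cross-entropy loss.
    """
    W = [[0.0] * n for _ in range(n)]
    for _ in range(epochs):
        grad = [[0.0] * n for _ in range(n)]
        for F, L in samples:
            z = sum(W[i][j] * F[i][j] for i in range(n) for j in range(n))
            err = sigmoid(z) - L          # d(BCE)/dz for one sample
            for i in range(n):
                for j in range(n):
                    grad[i][j] += err * F[i][j]
        for i in range(n):
            for j in range(n):
                W[i][j] -= lr * grad[i][j] / len(samples)
    return W
```

With one weight per DCT coefficient, the learned matrix can be read directly as a map of which frequencies push a block towards further division.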

Further, the division threshold α64 is obtained as follows:

selecting the samples labeled L = 0 from the training set and calculating their probability scores:

Figure GDA0004182369950000028

obtaining the division threshold α64 from the calculated probability scores.

Further, the division threshold is:

Figure GDA00041823699500000210
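
The threshold formula is also only available as an image. As a hedged illustration of the idea of deriving αN from the scores of the L = 0 samples, the sketch below computes a few standard summaries of that score distribution. Which summary, if any, matches the patent's choice is unknown, and `candidate_thresholds` and `q` are hypothetical names.

```python
import statistics

def candidate_thresholds(scores_l0, q=0.9):
    """Summaries of the L=0 score distribution that could serve as alpha_N.

    ASSUMPTION: these candidates are illustrative; the patent's own
    threshold formula is not reproduced in this text.
    """
    ordered = sorted(scores_l0)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return {
        "mean": statistics.fmean(ordered),
        "max": ordered[-1],
        "quantile": ordered[idx],
    }
```

A higher threshold stops more blocks early and saves more time, at the cost of occasionally leaving a block undivided that the exhaustive search would have split.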

Further, dividing the video image into several first CU blocks of size 64x64 includes:

dividing the video image into several first CU blocks of size 64x64 according to its luminance component.

Further, dividing the video image into several first CU blocks of size 64x64 includes:

after the video image has been divided into first CU blocks of size 64x64, performing pixel interpolation on any remaining pixel area that is too small to form a 64x64 block.
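
The text does not say which interpolation is used for the leftover border. One simple possibility, shown here purely as an assumption, is to extend the luminance plane to the next multiple of 64 by repeating its last column and last row. `pad_to_multiple` is a hypothetical helper name.

```python
def pad_to_multiple(plane, block=64):
    """Extend a luma plane (list of rows) to a multiple of `block`.

    ASSUMPTION: edge replication stands in for the unspecified
    'pixel interpolation' in the text.
    """
    height, width = len(plane), len(plane[0])
    pad_w = (-width) % block
    pad_h = (-height) % block
    rows = [row + [row[-1]] * pad_w for row in plane]
    rows += [list(rows[-1]) for _ in range(pad_h)]
    return rows
```

For a 1920x1080 frame this would add 8 rows, giving 1920x1088, which tiles exactly into 30 x 17 blocks of 64x64.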

Another technical solution adopted by the present invention is:

A system for learning CU depth division based on frequency domain distribution, comprising:

a first division module, configured to obtain a video image, divide the video image into several first CU blocks of size 64x64, obtain the frequency domain coefficient distribution matrix F64 of the DCT of each first CU block, and obtain a probability score p64 from F64;

a second division module, configured to, if p64 > α64, divide the first CU block into four second CU blocks of size 32x32, obtain the frequency domain coefficient distribution matrix F32 of the DCT of each second CU block, and obtain a probability score p32 from F32; otherwise, end the division of the first CU block;

a third division module, configured to, if p32 > α32, divide the second CU block into four third CU blocks of size 16x16, obtain the frequency domain coefficient distribution matrix F16 of the DCT of each third CU block, and obtain a probability score p16 from F16; otherwise, end the division of the second CU block;

a fourth division module, configured to, if p16 > α16, divide the third CU block into four fourth CU blocks of size 8x8; otherwise, end the division of the third CU block;

where αN is the division threshold, N = 64, 32, 16.

Another technical solution adopted by the present invention is:

A device for learning CU depth division based on frequency domain distribution, comprising:

at least one processor;

at least one memory for storing at least one program;

when the at least one program is executed by the at least one processor, the at least one processor implements the method described above.

Another technical solution adopted by the present invention is:

A computer-readable storage medium storing a processor-executable program which, when executed by a processor, performs the method described above.

The beneficial effects of the present invention are as follows: by using the probability score and the division threshold to decide whether to continue dividing, the invention provides a way to terminate division early. It does not need to recursively traverse all cases, which reduces the complexity of CU depth division and saves a great deal of encoding time.

Brief Description of the Drawings

To illustrate the technical solutions in the embodiments of the present invention or in the prior art more clearly, the relevant drawings are introduced below. It should be understood that these drawings are provided only to describe some embodiments of the technical solution of the present invention conveniently and clearly, and that those skilled in the art can derive other drawings from them without creative effort.

FIG. 1 is a flowchart of the steps of a method for learning CU depth division based on frequency domain distribution in an embodiment of the present invention;

FIG. 2 is a schematic diagram of a video image in an embodiment of the present invention;

FIG. 3 is a visualization of the 64x64 DCT frequency domain coefficient distribution of the CU blocks of the video image in FIG. 2;

FIG. 4 is a flow diagram of a method for fast CU depth division based on frequency domain distribution learning in an embodiment of the present invention;

FIG. 5 is a flowchart of the network training that learns the frequency domain distribution weight matrix W64 and the division threshold α64 in an embodiment of the present invention.

Detailed Description

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the present invention and are not to be construed as limiting it. The step numbers in the following embodiments are provided only for ease of explanation and do not restrict the order of the steps; the execution order of the steps can be adapted according to the understanding of those skilled in the art.

In the description of the present invention, it should be understood that orientation terms such as up, down, front, back, left and right refer to the orientations or positional relationships shown in the drawings. They are used only to simplify the description, do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and therefore should not be understood as limiting the present invention.

In the description of the present invention, "several" means one or more and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, while "above", "below", "within" and the like are understood as including it. "First" and "second" are used only to distinguish technical features, and are not to be understood as indicating or implying relative importance, the number of the indicated technical features, or their order.

In the description of the present invention, unless otherwise expressly limited, terms such as "arrange", "install" and "connect" should be understood broadly, and those skilled in the art can reasonably determine their specific meaning in the present invention from the specific content of the technical solution.

As shown in FIG. 1, this embodiment provides a method for learning CU depth division based on frequency domain distribution, comprising the following steps:

S101. Obtain a video image, divide the video image into several first CU blocks of size 64x64, obtain the frequency domain coefficient distribution matrix F64 of the DCT of each first CU block, and obtain the probability score p64 from F64.

The luminance component of the video frame is selected and divided into N CU blocks of size 64x64. If a remaining area is smaller than 64x64 pixels, pixel interpolation is performed on it.

The DCT is applied to the k-th 64x64 CU region (k = 1, 2, ..., N) to obtain its frequency domain coefficient matrix, and its division probability is calculated:

Figure GDA0004182369950000051
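
For readers who want to see the transform step concretely, here is a minimal pure-Python sketch of a 2-D DCT of a square block, computed as F = C · B · Cᵀ with an orthonormal DCT-II basis C. The normalization is an assumption (the text does not state which DCT scaling is used), and a production encoder would use an optimized integer transform instead.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis: row u holds the u-th cosine basis vector."""
    rows = []
    for u in range(n):
        scale = math.sqrt((1.0 if u == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

def dct2(block):
    """2-D DCT of a square block: F = C * block * C^T."""
    c = dct_matrix(len(block))
    ct = [list(r) for r in zip(*c)]
    return matmul(matmul(c, block), ct)
```

A perfectly flat block puts all of its energy into the single coefficient at row 0, column 0, while texture and edges spread energy towards higher row and column indices.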

S102. If p64 > α64, divide the first CU block into four second CU blocks of size 32x32, obtain the frequency domain coefficient distribution matrix F32 of the DCT of each second CU block, and obtain the probability score p32 from F32; otherwise, end the division of the first CU block.

If p64 is smaller than α64, the CU region stops dividing early; if p64 is greater than α64, the CU region (called LCU_64) continues to be divided into four CU regions of size 32x32 according to the quadtree principle.

For an LCU_64 that continues to be divided, obtain the frequency domain coefficient matrix of its m-th 32x32 CU region (m = 1, 2, 3, 4) and calculate its division probability:

Figure GDA0004182369950000059

S103. If p32 > α32, divide the second CU block into four third CU blocks of size 16x16, obtain the frequency domain coefficient distribution matrix F16 of the DCT of each third CU block, and obtain the probability score p16 from F16; otherwise, end the division of the second CU block.

If p32 is smaller than α32, the CU region stops dividing early; if p32 is greater than α32, the CU region (called LCU_32) continues to be divided into four CU regions of size 16x16 according to the quadtree principle.

Similarly, for an LCU_32 that continues to be divided, obtain the frequency domain coefficient matrix of its n-th 16x16 CU region (n = 1, 2, 3, 4) and calculate its division probability:

Figure GDA00041823699500000517

S104. If p16 > α16, divide the third CU block into four fourth CU blocks of size 8x8; otherwise, end the division of the third CU block.

If p16 is smaller than α16, the CU region stops dividing early; if p16 is greater than α16, the CU region is divided into four CU regions of size 8x8 according to the quadtree principle, and the process ends.

As can be seen from the above, the method provides a way to terminate division early without recursively traversing all cases, which reduces the complexity of CU depth division and saves a great deal of encoding time. Moreover, computing the division probability score from frequency-domain learning is essentially just an operation on two matrices, which is simple for a processor to carry out and saves considerable computing resources and time.

The method is explained in detail below with reference to the drawings and specific embodiments.

FIG. 4 is a flowchart of a method for fast CU depth division based on frequency domain distribution learning provided by an embodiment of the present invention. The method comprises a frequency domain distribution learning network module and a CU depth division decision module. Before CU depth division decisions can be made, two important parameters, the frequency domain distribution weight matrix WN and the division threshold αN (N = 64, 32, 16), must first be learned by the frequency domain distribution learning network module.

FIG. 5 shows the network training flowchart for learning the frequency domain distribution weight matrix W64 and the division threshold α64 provided by an embodiment of the present invention. The remaining weight matrices W32 and W16 and division thresholds α32 and α16 follow a similar training flow, so no additional drawings are provided.

Step A1: Obtain the training set required by the network. Each sample pairs the DCT frequency domain coefficient distribution matrix of the k-th 64x64 CU block with a label Lk = 0 or 1 indicating whether that CU block continues to be divided (0 means no, 1 means yes).

Step A2: Set the loss function

Figure GDA0004182369950000063

and train the network to obtain the frequency domain distribution weight matrix W64.

Step A3: Using the learned frequency domain distribution weight matrix W64, select the samples labeled L = 0 from the data set and calculate their division probability scores:

Figure GDA0004182369950000065

Step A4: Observe the distribution of these probability scores and choose a suitable way to set the division threshold α64. In this embodiment, choosing

Figure GDA0004182369950000067

gives good experimental results.

Referring to FIG. 2 and FIG. 3, regions with more high-frequency components tend to be divided into smaller CU blocks, and the frequency domain distribution learning network module learns precisely how to represent the relationship between the richness of high-frequency components (content richness) and the CU division depth. This relationship is expressed through the frequency domain distribution weight matrix WN and the division threshold αN.
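
To make "richness of high-frequency components" concrete, the sketch below computes a simple hand-written measure: the share of non-DC coefficient energy that lies at or beyond a diagonal cutoff i + j. This is an illustrative proxy chosen here, not the learned weighting WN, and `high_freq_ratio` and `cutoff` are hypothetical names.

```python
def high_freq_ratio(F, cutoff):
    """Share of AC energy held by coefficients with i + j >= cutoff.

    ASSUMPTION: a fixed diagonal cutoff is only a crude stand-in for
    the learned weight matrix described in the text.
    """
    total = high = 0.0
    for i, row in enumerate(F):
        for j, c in enumerate(row):
            if i == 0 and j == 0:
                continue          # skip the DC term (mean brightness)
            energy = c * c
            total += energy
            if i + j >= cutoff:
                high += energy
    return high / total if total else 0.0
```

A learned weight matrix plays a similar role but lets training decide how much each frequency position should count, instead of fixing a single cutoff by hand.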

步骤S1:对任意视频图像的亮度分量进行划分若干个64x64大小CU块,求其DCT变换的频域系数分布矩阵

Figure GDA0004182369950000068
根据
Figure GDA0004182369950000069
计算其对应概率分数;Step S1: Divide the brightness component of any video image into several 64x64 CU blocks and calculate the frequency domain coefficient distribution matrix of its DCT transform
Figure GDA0004182369950000068
according to
Figure GDA0004182369950000069
Calculate its corresponding probability score;

Step S2: Compare the probability score
Figure GDA00041823699500000610
with the division threshold α64;

Step S3: If
Figure GDA00041823699500000611
end the division of this CU block early;

Step S4: If
Figure GDA0004182369950000071
divide the CU block downward into four 32x32 sub-CU blocks according to the quadtree principle;

Step S5: Apply the DCT transform to the four 32x32 sub-CU blocks to obtain the frequency domain coefficient distribution matrices
Figure GDA0004182369950000072
and calculate the probability scores
Figure GDA0004182369950000073

Step S6: Compare the probability score
Figure GDA0004182369950000074
with the division threshold α32;

Step S7: If
Figure GDA0004182369950000075
end the division of this CU block early;

Step S8: If
Figure GDA0004182369950000076
divide the CU block downward into four 16x16 sub-CU blocks according to the quadtree principle;

Step S9: Apply the DCT transform to the four 16x16 sub-CU blocks to obtain the frequency domain coefficient distribution matrices
Figure GDA0004182369950000077
and calculate the probability scores
Figure GDA0004182369950000078

Step S10: Compare the probability score
Figure GDA0004182369950000079
with the division threshold α16;

Step S11: If
Figure GDA00041823699500000710
end the division of this CU block early;

Step S12: If
Figure GDA00041823699500000711
divide the CU block downward into four 8x8 sub-CU blocks according to the quadtree principle, which ends this division.
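Steps S1 to S12 apply the same pattern at 64x64, 32x32, and 16x16, so they can be sketched as one recursive routine. The inequalities themselves appear only in the formula images, so the direction of the comparison (a score above the threshold means "divide") is an assumption here, and `divide_cu`, `score_fn`, and `toy_score` are hypothetical names introduced for illustration.

```python
def divide_cu(x, y, size, score_fn, thresholds, min_size=8):
    """Return the leaf CU blocks (x, y, size) chosen for the block at (x, y).

    score_fn(x, y, size) -> float is a hypothetical callback that performs the
    DCT and weighting for one block; thresholds maps block size -> alpha.
    Assumption: a score above alpha means "divide", otherwise stop early.
    """
    if size <= min_size or score_fn(x, y, size) <= thresholds[size]:
        return [(x, y, size)]          # early termination (S3 / S7 / S11) or 8x8 reached
    half = size // 2
    leaves = []
    for dy in (0, half):               # quadtree split into four sub-CU blocks
        for dx in (0, half):
            leaves.extend(divide_cu(x + dx, y + dy, half, score_fn, thresholds, min_size))
    return leaves

def toy_score(x, y, size):
    # Hypothetical stand-in: only blocks anchored at the top-left corner are "detailed".
    return 1.0 if (x == 0 and y == 0) else 0.0

alphas = {64: 0.5, 32: 0.5, 16: 0.5}   # illustrative thresholds, not learned values
leaves = divide_cu(0, 0, 64, toy_score, alphas)
```

In this toy run only the top-left corner keeps dividing down to 8x8, and every other sub-block stops at the first level where its score falls below the threshold.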

Steps S3, S7, and S11 each offer a chance to end the division early, so in many cases the encoder avoids traversing every division mode of a CU block before making a decision, which saves encoding time and computing resources. The divisions of different sub-CU blocks are independent of one another, so the program can process them in parallel and decide on multiple sub-CU blocks at the same time, greatly reducing encoding time. Only one step of each pair (S3 or S4, S7 or S8, S11 or S12) is executed. Even when a 64x64 CU block must be divided down to the minimum 8x8 CU blocks, only 9 steps are required, comprising 3 DCT transform steps, 3 matrix operations, and 3 decisions, which greatly reduces the demand on processor computing resources.
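Because each sub-CU decision depends only on its own score, the decisions can be dispatched concurrently. The sketch below uses a thread pool from the Python standard library. The scores, the threshold value, and the assumption that a score above the threshold means "divide" are all illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def decide(block_id, score, alpha):
    """Decision for one sub-CU block; independent of every other block."""
    return block_id, score > alpha     # assumption: score above alpha means "divide"

# Hypothetical scores for the four 32x32 sub-CU blocks of one 64x64 block.
sub_scores = {0: 0.9, 1: 0.1, 2: 0.7, 3: 0.2}
alpha_32 = 0.5

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(decide, m, s, alpha_32) for m, s in sub_scores.items()]
    decisions = dict(f.result() for f in futures)

# Steps executed along one path from 64x64 down to 8x8: at each of the three
# levels, one DCT, one matrix operation, and one decision, as the text counts them.
steps_on_deepest_path = 3 * len((64, 32, 16))
```

In CPython, threads mainly illustrate the independence of the decisions; real CPU parallelism would more likely come from a process pool or from a native DCT implementation that releases the interpreter lock.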

In the test experiments of this embodiment, the method obtains CU division depth results similar to those of the conventional HEVC coding framework and greatly reduces encoding time without affecting video quality or bit rate.

This embodiment also provides a system for learning CU depth division based on frequency domain distribution, including:

a first division module, configured to acquire a video image, divide the video image into a number of first CU blocks of size 64x64, obtain the frequency domain coefficient distribution matrix of the DCT transform of each first CU block
Figure GDA00041823699500000712
and, according to the frequency domain coefficient distribution matrix
Figure GDA00041823699500000713
obtain the probability score
Figure GDA00041823699500000714

a second division module, configured to, if
Figure GDA00041823699500000715
divide the first CU block downward into four second CU blocks of size 32x32, obtain the frequency domain coefficient distribution matrix of the DCT transform of each second CU block
Figure GDA00041823699500000716
and, according to the frequency domain coefficient distribution matrix
Figure GDA00041823699500000717
obtain the probability score
Figure GDA0004182369950000081
and otherwise to end the division of the first CU block;

a third division module, configured to, if
Figure GDA0004182369950000082
divide the second CU block downward into four third CU blocks of size 16x16, obtain the frequency domain coefficient distribution matrix of the DCT transform of each third CU block
Figure GDA0004182369950000083
and, according to the frequency domain coefficient distribution matrix
Figure GDA0004182369950000084
obtain the probability score
Figure GDA0004182369950000085
and otherwise to end the division of the second CU block;

a fourth division module, configured to, if
Figure GDA0004182369950000086
divide the third CU block downward into four fourth CU blocks of size 8x8, and otherwise to end the division of the third CU block;

where αN is the division threshold, with N = 64, 32, 16.

The system for learning CU depth division based on frequency domain distribution of this embodiment can execute the method for learning CU depth division based on frequency domain distribution provided by the method embodiment of the present invention, can execute any combination of the implementation steps of the method embodiment, and has the corresponding functions and beneficial effects of the method.

This embodiment also provides a device for learning CU depth division based on frequency domain distribution, including:

at least one processor;

at least one memory for storing at least one program;

when the at least one program is executed by the at least one processor, the at least one processor implements the method shown in FIG. 1.

The device for learning CU depth division based on frequency domain distribution of this embodiment can execute the method for learning CU depth division based on frequency domain distribution provided by the method embodiment of the present invention, can execute any combination of the implementation steps of the method embodiment, and has the corresponding functions and beneficial effects of the method.

An embodiment of the present application also discloses a computer program product or computer program that includes computer instructions stored in a computer-readable storage medium. A processor of a computer device can read the computer instructions from the computer-readable storage medium and execute them, causing the computer device to perform the method shown in FIG. 1.

This embodiment also provides a storage medium storing instructions or a program capable of executing the method for learning CU depth division based on frequency domain distribution provided by the method embodiment of the present invention. When the instructions or program are run, any combination of the implementation steps of the method embodiment can be executed, with the corresponding functions and beneficial effects of the method.

In some alternative embodiments, the functions/operations mentioned in the block diagrams may occur out of the order noted in the operational illustrations. For example, depending on the functions/operations involved, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to give a more complete understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of a larger operation are performed independently.

In addition, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated, one or more of the described functions and/or features may be integrated into a single physical device and/or software module, or one or more functions and/or features may be implemented in separate physical devices or software modules. It should also be understood that a detailed discussion of the actual implementation of each module is not necessary for understanding the present invention. Rather, given the attributes, functions, and internal relationships of the various functional modules in the device disclosed herein, the actual implementation of the modules is within the routine skill of an engineer. Therefore, those skilled in the art can, using ordinary skill, implement the present invention as set forth in the claims without undue experimentation. It should also be understood that the specific concepts disclosed are merely illustrative and are not intended to limit the scope of the present invention, which is determined by the full scope of the appended claims and their equivalents.

If the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate, or transmit a program for use by, or in conjunction with, an instruction execution system, apparatus, or device.

More specific examples (a non-exhaustive list) of computer-readable media include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that the various parts of the present invention can be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following technologies known in the art can be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field programmable gate array (FPGA), and so on.

In the above description of this specification, a description referring to the terms "one embodiment/example", "another embodiment/example", or "certain embodiments/examples" means that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. In this specification, such references do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.

Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations may be made to these embodiments without departing from the principles and spirit of the present invention, the scope of which is defined by the claims and their equivalents.

The preferred embodiments of the present invention have been described in detail above, but the present invention is not limited to these embodiments. Those skilled in the art may make various equivalent modifications or substitutions without departing from the spirit of the present invention, and all such equivalent modifications or substitutions fall within the scope defined by the claims of this application.

Claims (8)

1. A method for learning CU depth division based on frequency domain distribution, characterized by comprising the following steps:
acquiring a video image, dividing the video image into a plurality of first CU blocks of size 64x64, and obtaining the frequency domain coefficient distribution matrix of the DCT (discrete cosine transform) of each first CU block
Figure FDA0004199156600000011
and, according to the frequency domain coefficient distribution matrix
Figure FDA0004199156600000012
obtaining the probability score
Figure FDA0004199156600000013
wherein the k in
Figure FDA0004199156600000014
denotes the k-th first CU block among the plurality of first CU blocks;
if
Figure FDA0004199156600000015
dividing the first CU block downward into 4 second CU blocks of size 32x32, obtaining the frequency domain coefficient distribution matrix of the DCT of each second CU block
Figure FDA0004199156600000016
and, according to the frequency domain coefficient distribution matrix
Figure FDA0004199156600000017
obtaining the probability score
Figure FDA0004199156600000018
otherwise, ending the division of the first CU block; wherein the m in
Figure FDA0004199156600000019
denotes the m-th of the second CU blocks divided downward from the first CU block;
if
Figure FDA00041991566000000110
dividing the second CU block downward into 4 third CU blocks of size 16x16, obtaining the frequency domain coefficient distribution matrix of the DCT of each third CU block
Figure FDA00041991566000000111
and, according to the frequency domain coefficient distribution matrix
Figure FDA00041991566000000112
obtaining the probability score
Figure FDA00041991566000000113
otherwise, ending the division of the second CU block; wherein the n in
Figure FDA00041991566000000114
denotes the n-th of the third CU blocks divided downward from the second CU block;
if
Figure FDA00041991566000000115
dividing the third CU block downward into 4 fourth CU blocks of size 8x8; otherwise, ending the division of the third CU block;
wherein α64, α32, and α16 are all division thresholds;
the probability score
Figure FDA00041991566000000116
is calculated by the following formula:
Figure FDA00041991566000000117
the probability score
Figure FDA00041991566000000118
is calculated by the following formula:
Figure FDA00041991566000000119
the probability score
Figure FDA00041991566000000120
is calculated by the following formula:
Figure FDA00041991566000000121
where W64, W32, and W16 are preset frequency domain distribution weight matrices, i denotes the row coordinate of the matrix, and j denotes the column coordinate of the matrix.
2. The method for learning CU depth division based on frequency domain distribution according to claim 1, characterized in that the frequency domain distribution weight matrix W64 is obtained by:
acquiring the training set and samples required by the network
Figure FDA0004199156600000021
where
Figure FDA0004199156600000022
denotes the DCT frequency domain coefficient distribution matrix corresponding to the k-th CU block of size 64x64, and Lk indicates whether the k-th 64x64 CU block continues to be divided downward, with Lk = 0 meaning no and Lk = 1 meaning yes;
training the network according to a preset loss function to obtain the frequency domain distribution weight matrix W64;
the preset loss function being expressed as follows:
Figure FDA0004199156600000023
where N represents the number of data in the training set, and N represents the size of the CU block.
3. The method for learning CU depth division based on frequency domain distribution according to claim 2, characterized in that the division threshold α64 is obtained by:
selecting from the training set the samples labeled Lk = 0
Figure FDA0004199156600000024
calculating probability scores from the selected samples
Figure FDA0004199156600000025
and obtaining the division threshold α64 from the calculated probability scores
Figure FDA0004199156600000026
the division threshold being
Figure FDA0004199156600000027
4. The method for learning CU depth division based on frequency domain distribution according to claim 1, characterized in that dividing the video image into a plurality of first CU blocks of size 64x64 comprises:
dividing the video image into first CU blocks of size 64x64 according to the luminance component.
5. The method for learning CU depth division based on frequency domain distribution according to claim 1, characterized in that dividing the video image into a plurality of first CU blocks of size 64x64 comprises:
after the video image is divided into a plurality of first CU blocks of size 64x64, performing pixel interpolation on the remaining pixel areas that do not fill a 64x64 block.
6. A system for learning CU depth division based on frequency domain distribution, comprising:
a first division module, configured to acquire a video image, divide the video image into a plurality of first CU blocks of size 64x64, and obtain the frequency domain coefficient distribution matrix of the DCT of each first CU block
Figure FDA0004199156600000031
and, according to the frequency domain coefficient distribution matrix
Figure FDA0004199156600000032
obtain the probability score
Figure FDA0004199156600000033
wherein the k in
Figure FDA0004199156600000034
denotes the k-th first CU block among the plurality of first CU blocks;
a second division module, configured to, if
Figure FDA0004199156600000035
divide the first CU block downward into 4 second CU blocks of size 32x32, obtain the frequency domain coefficient distribution matrix of the DCT of each second CU block
Figure FDA0004199156600000036
and, according to the frequency domain coefficient distribution matrix
Figure FDA0004199156600000037
obtain the probability score
Figure FDA0004199156600000038
and otherwise to end the division of the first CU block; wherein the m in
Figure FDA0004199156600000039
denotes the m-th of the second CU blocks divided downward from the first CU block;
a third division module, configured to, if
Figure FDA00041991566000000310
divide the second CU block downward into 4 third CU blocks of size 16x16, obtain the frequency domain coefficient distribution matrix of the DCT of each third CU block
Figure FDA00041991566000000311
and, according to the frequency domain coefficient distribution matrix
Figure FDA00041991566000000312
obtain the probability score
Figure FDA00041991566000000313
and otherwise to end the division of the second CU block; wherein the n in
Figure FDA00041991566000000314
denotes the n-th of the third CU blocks divided downward from the second CU block;
a fourth division module, configured to, if
Figure FDA00041991566000000315
divide the third CU block downward into 4 fourth CU blocks of size 8x8, and otherwise to end the division of the third CU block;
wherein α64, α32, and α16 are all division thresholds;
the probability score
Figure FDA00041991566000000316
is calculated by the following formula:
Figure FDA00041991566000000317
the probability score
Figure FDA00041991566000000318
is calculated by the following formula:
Figure FDA00041991566000000319
the probability score
Figure FDA00041991566000000320
is calculated by the following formula:
Figure FDA00041991566000000321
where W64, W32, and W16 are preset frequency domain distribution weight matrices, i denotes the row coordinate of the matrix, and j denotes the column coordinate of the matrix.
7. A device for learning CU depth division based on frequency domain distribution, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-5.
8. A computer-readable storage medium storing a processor-executable program, characterized in that the processor-executable program, when executed by a processor, performs the method according to any one of claims 1-5.
CN202210241583.1A 2022-03-11 2022-03-11 Method, system, device and medium for learning CU depth division based on frequency domain distribution Active CN114827630B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210241583.1A CN114827630B (en) 2022-03-11 2022-03-11 Method, system, device and medium for learning CU depth division based on frequency domain distribution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210241583.1A CN114827630B (en) 2022-03-11 2022-03-11 Method, system, device and medium for learning CU depth division based on frequency domain distribution

Publications (2)

Publication Number Publication Date
CN114827630A CN114827630A (en) 2022-07-29
CN114827630B true CN114827630B (en) 2023-06-06

Family

ID=82529378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210241583.1A Active CN114827630B (en) 2022-03-11 2022-03-11 Method, system, device and medium for learning CU depth division based on frequency domain distribution

Country Status (1)

Country Link
CN (1) CN114827630B (en)


Also Published As

Publication number Publication date
CN114827630A (en) 2022-07-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant