CN114827630B

CN114827630B - Method, system, device and medium for learning CU depth division based on frequency domain distribution

Info

Publication number: CN114827630B
Application number: CN202210241583.1A
Authority: CN
Inventors: 许皓淇; 曹英烈; 周智恒
Original assignee: South China University of Technology SCUT; Guangzhou City University of Technology
Current assignee: South China University of Technology SCUT; Guangzhou City University of Technology
Priority date: 2022-03-11
Filing date: 2022-03-11
Publication date: 2023-06-06
Anticipated expiration: 2042-03-11
Also published as: CN114827630A

Abstract

The invention discloses a method, system, device and medium for learning CU depth division based on frequency domain distribution. The method includes: dividing an image into several 64x64 blocks and performing a DCT transform on each block to obtain a frequency domain coefficient distribution matrix F₆₄; calculating a probability score p₆₄ from F₆₄ and a weight matrix W₆₄; if p₆₄ is smaller than the division threshold α₆₄, ending the downward division, and if it is greater than α₆₄, dividing the block into four 32x32 sub-CU blocks according to the quadtree principle; for each sub-CU block, calculating a probability score p₃₂ from its frequency domain coefficient matrix F₃₂ and W₃₂, and using the division threshold α₃₂ to decide whether to continue dividing; and so on, until every CU block has either stopped early or been divided into the smallest 8x8 CU blocks. Because the decision to continue dividing is made from the probability score and the division threshold, the invention does not need to traverse all cases recursively, which reduces the complexity of CU depth division and saves a large amount of encoding time. The invention can be widely applied in the field of video coding.

Description

Method, system, device and medium for learning CU depth division based on frequency domain distribution

Technical Field

The present invention relates to the fields of artificial intelligence and video coding, and in particular to a method, system, device and medium for learning CU depth division based on frequency domain distribution.

Background Art

In recent years, with the development of the Internet and communication technology, the rapid growth of video traffic has posed great challenges to video coding technology.

In a traditional coding framework (taking HEVC as an example), each frame to be coded is usually divided into a sequence of CTUs (Coding Tree Units) before the subsequent prediction, transform and quantization operations. A CTU can be divided recursively, according to the quadtree principle, into CUs (Coding Units) of different sizes, the largest being 64x64 and the smallest 8x8. The quality of the CTU division determines the efficiency of the subsequent coding.

To obtain the best CTU division, the encoder recursively traverses the divisions from the 64x64 CU down to the 8x8 CUs, evaluates each division with its built-in rate-distortion cost function, and selects the division with the best result.

This approach wastes a great deal of encoding time and computing resources, and the waste grows significantly as video resolution increases. How to reduce the complexity of CU depth division has therefore become a widely discussed problem in the industry.

Summary of the Invention

In order to solve, at least to some extent, one of the technical problems in the prior art, the purpose of the present invention is to provide a method, system, device and medium for learning CU depth division based on frequency domain distribution.

The technical solution adopted by the present invention is:

A CU depth division method based on frequency domain distribution learning, including the following steps:

obtaining a video image, dividing the video image into several first CU blocks of size 64x64, obtaining the frequency domain coefficient distribution matrix F₆₄ of the DCT transform of each first CU block, and obtaining a probability score p₆₄ according to the frequency domain coefficient distribution matrix F₆₄;

if p₆₄ > α₆₄, dividing the first CU block downward into four second CU blocks of size 32x32, obtaining the frequency domain coefficient distribution matrix F₃₂ of the DCT transform of each second CU block, and obtaining a probability score p₃₂ according to the frequency domain coefficient distribution matrix F₃₂; otherwise, ending the division of the first CU block;

if p₃₂ > α₃₂, dividing the second CU block downward into four third CU blocks of size 16x16, obtaining the frequency domain coefficient distribution matrix F₁₆ of the DCT transform of each third CU block, and obtaining a probability score p₁₆ according to the frequency domain coefficient distribution matrix F₁₆; otherwise, ending the division of the second CU block;

if p₁₆ > α₁₆, dividing the third CU block downward into four fourth CU blocks of size 8x8; otherwise, ending the division of the third CU block;

where α_N is the division threshold, N = 64, 32, 16.

Further, the probability score p₆₄ is calculated by the following formula:

[equation not reproduced in this text]

where W₆₄ is a preset frequency domain distribution weight matrix, i denotes the row coordinate of the matrix, and j denotes the column coordinate of the matrix.
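The idea of scoring a block from its coefficient matrix and a weight matrix can be sketched as follows. Because the formula itself is not reproduced in this text, the sketch assumes a logistic function applied to the sum over i and j of W(i, j)·F(i, j); the function name `probability_score` and this functional form are illustrative assumptions, not the patented formula.

```python
import math

def probability_score(F, W):
    """Map a coefficient matrix F and a weight matrix W to a score in [0, 1].

    ASSUMPTION: logistic function of sum_ij W[i][j] * F[i][j]; the source
    text does not show the actual formula.
    """
    s = sum(w * f for w_row, f_row in zip(W, F) for w, f in zip(w_row, f_row))
    # Numerically stable logistic function (avoids overflow for large |s|).
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

# Toy 2x2 example; real inputs would be 64x64, 32x32 or 16x16 matrices.
F = [[4.0, 1.0], [0.5, 2.0]]
W = [[0.0, 0.3], [0.3, 0.6]]
p = probability_score(F, W)   # weighted sum is 1.65 before the logistic step
```

Under this assumption, an all-zero weight matrix gives a score of exactly 0.5 for any block, so the learned weights alone decide which frequency positions push a block towards further division.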

Further, the frequency domain distribution weight matrix W₆₄ is obtained by:

obtaining the training set required by the network, in which each sample is the DCT-transform frequency domain coefficient distribution matrix of the k-th 64x64 CU block, and its label L_k = 0 or 1 indicates whether the CU block continues to be divided downward (0 means no, 1 means yes);

training the network according to a preset loss function to obtain the frequency domain distribution weight matrix W₆₄;

the expression of the preset loss function being:

[equation not reproduced in this text]

Further, the division threshold α₆₄ is obtained by:

selecting the samples labeled L = 0 in the training set;

calculating the probability score p₆₄ of each selected sample;

obtaining the division threshold α₆₄ from the calculated probability scores p₆₄.

Further, the division threshold α₆₄ is set as:

[equation not reproduced in this text]
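One way to turn the scores of the L = 0 samples into a threshold is sketched below. The selection rule used in the source is not reproduced in this text, so the quantile rule, the parameter `q` and the function name `choose_threshold` are all hypothetical.

```python
def choose_threshold(scores_no_split, q=0.9):
    """Pick a threshold from the scores of samples labelled L = 0.

    ASSUMPTION: use (approximately) the q-quantile of those scores, so that
    most non-split training blocks score at or below the threshold.
    """
    if not scores_no_split:
        raise ValueError("need at least one L = 0 sample")
    ordered = sorted(scores_no_split)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]

# Made-up scores, for illustration only.
scores = [0.05, 0.10, 0.12, 0.20, 0.35]
alpha_64 = choose_threshold(scores, q=0.5)   # picks the middle score, 0.12
```

A larger `q` stops more blocks early (faster, but with a higher risk of missing a useful split); a smaller `q` does the opposite.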

Further, dividing the video image into several first CU blocks of size 64x64 includes:

dividing the video image into several first CU blocks of size 64x64 according to its luminance component.

After the video image has been divided into several first CU blocks of size 64x64, pixel interpolation is performed on the remaining pixel regions that are too small to form a 64x64 block.
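The block layout described above can be sketched as follows. The source does not say which interpolation is used for the leftover regions, so this sketch assumes simple edge replication; `pad_to_multiple` and `split_into_blocks` are hypothetical helper names, and the luminance plane is assumed to be a list of equally long rows.

```python
def pad_to_multiple(luma, block=64):
    """Extend a luminance plane so both dimensions are multiples of `block`.

    ASSUMPTION: leftover regions are filled by repeating the last column/row.
    """
    height, width = len(luma), len(luma[0])
    new_h = -(-height // block) * block   # round up to a multiple of block
    new_w = -(-width // block) * block
    rows = [list(row) + [row[-1]] * (new_w - width) for row in luma]
    rows += [list(rows[-1]) for _ in range(new_h - height)]
    return rows

def split_into_blocks(luma, block=64):
    """Yield (x, y, pixels) for each block x block tile, in raster order."""
    for y in range(0, len(luma), block):
        for x in range(0, len(luma[0]), block):
            yield x, y, [row[x:x + block] for row in luma[y:y + block]]

# Small demonstration with 4x4 tiles instead of 64x64.
plane = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
padded = pad_to_multiple(plane, block=4)           # becomes 4 rows x 8 columns
tiles = list(split_into_blocks(padded, block=4))   # two 4x4 tiles
```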

Another technical solution adopted by the present invention is:

A CU depth division system based on frequency domain distribution learning, comprising:

a first division module, configured to obtain a video image, divide the video image into several first CU blocks of size 64x64, obtain the frequency domain coefficient distribution matrix F₆₄ of the DCT transform of each first CU block, and obtain a probability score p₆₄ according to the frequency domain coefficient distribution matrix F₆₄;

a second division module, configured to, if p₆₄ > α₆₄, divide the first CU block downward into four second CU blocks of size 32x32, obtain the frequency domain coefficient distribution matrix F₃₂ of the DCT transform of each second CU block, and obtain a probability score p₃₂ according to the frequency domain coefficient distribution matrix F₃₂; and otherwise, end the division of the first CU block;

a third division module, configured to, if p₃₂ > α₃₂, divide the second CU block downward into four third CU blocks of size 16x16, obtain the frequency domain coefficient distribution matrix F₁₆ of the DCT transform of each third CU block, and obtain a probability score p₁₆ according to the frequency domain coefficient distribution matrix F₁₆; and otherwise, end the division of the second CU block;

a fourth division module, configured to, if p₁₆ > α₁₆, divide the third CU block downward into four fourth CU blocks of size 8x8; and otherwise, end the division of the third CU block.

A CU depth division device based on frequency domain distribution learning, comprising:

at least one processor;

at least one memory for storing at least one program;

wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the method described above.

A computer-readable storage medium storing a processor-executable program which, when executed by a processor, performs the method described above.

The beneficial effects of the present invention are as follows: the present invention decides whether to continue dividing from the probability score and the division threshold, which provides a way to terminate the division early, removes the need to traverse all cases recursively, reduces the complexity of CU depth division, and saves a large amount of encoding time.

Brief Description of the Drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the relevant drawings are introduced below. It should be understood that the drawings introduced below only serve to describe some embodiments of the technical solutions of the present invention conveniently and clearly, and that those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flowchart of the steps of a CU depth division method based on frequency domain distribution learning in an embodiment of the present invention;

FIG. 2 is a schematic diagram of a video image in an embodiment of the present invention;

FIG. 3 is a visualization of the distribution of the 64x64 DCT transform frequency domain coefficients of the CU blocks of the video image of FIG. 2;

FIG. 4 is a schematic flowchart of a fast CU depth division method based on frequency domain distribution learning in an embodiment of the present invention;

FIG. 5 is a flowchart of the network training for learning the frequency domain distribution weight matrix W₆₄ and the division threshold α₆₄ in an embodiment of the present invention.

Detailed Description

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and are not to be construed as limiting it. The step numbers in the following embodiments are provided only for ease of explanation and do not limit the order of the steps; the execution order of the steps in the embodiments can be adapted according to the understanding of those skilled in the art.

In the description of the present invention, it should be understood that orientation descriptions such as up, down, front, back, left and right refer to the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore should not be understood as limiting the present invention.

In the description of the present invention, "several" means one or more and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood as excluding the stated number, while "above", "below", "within" and the like are understood as including it. Where "first" and "second" are used, they serve only to distinguish technical features, and are not to be understood as indicating or implying relative importance, the number of the indicated technical features, or their order.

In the description of the present invention, unless otherwise clearly defined, terms such as "arranged", "installed" and "connected" should be understood broadly, and those skilled in the art can reasonably determine their specific meanings in the present invention in light of the specific content of the technical solution.

As shown in FIG. 1, this embodiment provides a CU depth division method based on frequency domain distribution learning, including the following steps:

S101: Obtain a video image, divide the video image into several first CU blocks of size 64x64, obtain the frequency domain coefficient distribution matrix F₆₄ of the DCT transform of each first CU block, and obtain the probability score p₆₄ according to the frequency domain coefficient distribution matrix F₆₄.

The luminance component of the video frame is selected and divided into N CU blocks of size 64x64. If any remaining region is smaller than 64x64 pixels, pixel interpolation is performed on it.

A DCT transform is applied to the k-th 64x64 CU area, k = 1, 2, ..., N, to obtain its frequency domain coefficient matrix F₆₄, and its division probability p₆₄ is calculated:

[equation not reproduced in this text]

S102: If p₆₄ > α₆₄, divide the first CU block downward into four second CU blocks of size 32x32, obtain the frequency domain coefficient distribution matrix F₃₂ of the DCT transform of each second CU block, and obtain the probability score p₃₂ according to the frequency domain coefficient distribution matrix F₃₂; otherwise, end the division of the first CU block.

If p₆₄ < α₆₄, the CU area stops dividing early; if p₆₄ > α₆₄, the CU area (called LCU_64) is further divided, according to the quadtree principle, into four CU areas of size 32x32.

For each LCU_64 that continues to be divided, the frequency domain coefficient matrix F₃₂ of its m-th 32x32 CU area is obtained, m = 1, 2, 3, 4, and its division probability p₃₂ is calculated:

[equation not reproduced in this text]

S103: If p₃₂ > α₃₂, divide the second CU block downward into four third CU blocks of size 16x16, obtain the frequency domain coefficient distribution matrix F₁₆ of the DCT transform of each third CU block, and obtain the probability score p₁₆ according to the frequency domain coefficient distribution matrix F₁₆; otherwise, end the division of the second CU block.

If p₃₂ < α₃₂, the CU area stops dividing early; if p₃₂ > α₃₂, the CU area (called LCU_32) is further divided, according to the quadtree principle, into four CU areas of size 16x16.

Similarly, for each LCU_32 that continues to be divided, the frequency domain coefficient matrix F₁₆ of its n-th 16x16 CU area is obtained, n = 1, 2, 3, 4, and its division probability p₁₆ is calculated:

[equation not reproduced in this text]

S104: If p₁₆ > α₁₆, divide the third CU block downward into four fourth CU blocks of size 8x8; otherwise, end the division of the third CU block.

If p₁₆ < α₁₆, the CU area stops dividing early; if p₁₆ > α₁₆, the CU area is further divided, according to the quadtree principle, into four CU areas of size 8x8, and the process ends.

As can be seen from the above, the method provides a way to terminate the division early, without recursively traversing all cases, which reduces the complexity of CU depth division and saves a large amount of encoding time. In addition, calculating the division probability score based on frequency domain learning is essentially just an operation on two matrices, which is simple for a processor to carry out and saves considerable computing resources and time.

The above method is explained in detail below with reference to the accompanying drawings and specific embodiments.

FIG. 4 is a flowchart of a fast CU depth division method based on frequency domain distribution learning provided by an embodiment of the present invention. The method includes a frequency domain distribution learning network module and a CU depth division decision module. Before a CU depth division decision is made, two important parameters, the frequency domain distribution weight matrix W_N and the division threshold α_N (N = 64, 32, 16), must first be learned by the frequency domain distribution learning network module.

FIG. 5 shows the network training flowchart for learning the frequency domain distribution weight matrix W₆₄ and the division threshold α₆₄ provided by an embodiment of the present invention. The training process for the remaining frequency domain distribution weight matrices W₃₂ and W₁₆ and division thresholds α₃₂ and α₁₆ is similar, so no additional figures are provided.

Step A1: Obtain the training set required by the network. Each sample is the DCT-transform frequency domain coefficient distribution matrix of the k-th 64x64 CU block, and its label L_k = 0 or 1 indicates whether that CU block continues to be divided downward (0 means no, 1 means yes);

Step A2: Set the loss function [equation not reproduced in this text] and train the network to obtain the frequency domain distribution weight matrix W₆₄;

Step A3: Using the frequency domain distribution weight matrix W₆₄ learned so far, select the samples labeled L = 0 in the data set and calculate their division probability scores p₆₄;

Step A4: Observe the distribution of the probability scores p₆₄ and choose a suitable way to set the division threshold α₆₄. In this embodiment, the choice [equation not reproduced in this text] gives good experimental results.
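Steps A1 and A2 can be sketched as a small training loop. The loss function is not reproduced in this text, so the sketch assumes a logistic model trained with binary cross-entropy by plain gradient descent; the names `train_weight_matrix`, `lr` and `epochs` and the toy 2x2 data are hypothetical and stand in for real 64x64 coefficient matrices.

```python
import math

def _sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def score(F, W):
    """ASSUMED score: logistic function of the element-wise weighted sum."""
    return _sigmoid(sum(w * f for wr, fr in zip(W, F) for w, f in zip(wr, fr)))

def train_weight_matrix(samples, labels, lr=0.5, epochs=300):
    """Fit W by gradient descent on binary cross-entropy (an assumption)."""
    n = len(samples[0])
    W = [[0.0] * n for _ in range(n)]
    for _ in range(epochs):
        grad = [[0.0] * n for _ in range(n)]
        for F, label in zip(samples, labels):
            err = score(F, W) - label   # derivative of the loss w.r.t. the weighted sum
            for i in range(n):
                for j in range(n):
                    grad[i][j] += err * F[i][j]
        for i in range(n):
            for j in range(n):
                W[i][j] -= lr * grad[i][j] / len(samples)
    return W

# Toy data: label 1 (split) when a high-frequency coefficient is present.
samples = [[[1.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]]]
labels = [0, 1]
W64 = train_weight_matrix(samples, labels)
```

On this toy data the learned weight at the high-frequency position becomes positive and the weight at the DC position becomes negative, so the block without high-frequency content scores below 0.5 and the other block scores above it.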

Referring to FIG. 2 and FIG. 3, areas with more high-frequency components tend to be divided into smaller CU blocks, and the frequency domain distribution learning network module learns how to represent the relationship between the richness of high-frequency components (content richness) and the CU division depth. This relationship is expressed by the frequency domain distribution weight matrix W_N and the division threshold α_N.
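To make the link between texture and frequency content concrete, the following sketch computes a 2-D DCT and measures how much of a block's energy lies outside the DC coefficient. It assumes an orthonormal DCT-II (the source does not state the normalization), and `ac_energy_share` is a hypothetical illustrative measure, not the learned score.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so that F = C * X * C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def dct2(block):
    """2-D DCT of a square block given as a list of rows."""
    c = dct_matrix(len(block))
    c_t = [list(col) for col in zip(*c)]
    return matmul(matmul(c, block), c_t)

def ac_energy_share(F):
    """Fraction of coefficient energy outside the DC term F[0][0]."""
    dc = F[0][0] * F[0][0]
    ac = sum(v * v for i, row in enumerate(F)
             for j, v in enumerate(row) if (i, j) != (0, 0))
    total = dc + ac
    return ac / total if total > 0 else 0.0

flat = [[10.0] * 8 for _ in range(8)]
checker = [[20.0 if (i + j) % 2 == 0 else 0.0 for j in range(8)] for i in range(8)]
flat_share = ac_energy_share(dct2(flat))        # essentially 0: all energy in DC
checker_share = ac_energy_share(dct2(checker))  # about 0.5: half the energy is non-DC
```

A flat block concentrates its energy in the DC coefficient, while a finely textured block spreads energy into higher-frequency coefficients, which is the kind of difference the weight matrix W_N is trained to pick up.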

Step S1: Divide the luminance component of any video image into several 64x64 CU blocks, compute the frequency domain coefficient distribution matrix F₆₄ of the DCT transform of each block, and calculate the corresponding probability score p₆₄ from F₆₄ and W₆₄;

Step S2: Compare the probability score p₆₄ with the division threshold α₆₄;

Step S3: If p₆₄ < α₆₄, end the division of the CU block early;

Step S4: If p₆₄ > α₆₄, divide the CU block into four 32x32 sub-CU blocks according to the quadtree principle;

Step S5: Perform a DCT transform on each of the four 32x32 sub-CU blocks to obtain the frequency domain coefficient distribution matrix F₃₂, and calculate the probability score p₃₂;

Step S6: Compare the probability score p₃₂ with the division threshold α₃₂;

Step S7: If p₃₂ < α₃₂, end the division of the CU block early;

Step S8: If p₃₂ > α₃₂, divide the CU block into four 16x16 sub-CU blocks according to the quadtree principle;

Step S9: Perform a DCT transform on each of the four 16x16 sub-CU blocks to obtain the frequency domain coefficient distribution matrix F₁₆, and calculate the probability score p₁₆;

Step S10: Compare the probability score p₁₆ with the division threshold α₁₆;

Step S11: If p₁₆ < α₁₆, end the division of the CU block early;

Step S12: If p₁₆ > α₁₆, divide the CU block into four 8x8 sub-CU blocks according to the quadtree principle, which ends the division.
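Steps S1 to S12 amount to a recursive quadtree walk with early termination, which can be sketched as follows. `partition_cu`, `score_fn`, `toy_score` and the `alpha` dictionary are hypothetical names; `score_fn` stands in for the score computed from the DCT coefficients and W_N (it can tell the block size from the length of its argument), and a score exactly equal to the threshold is assumed to stop the division.

```python
def partition_cu(block, score_fn, alpha, x=0, y=0, min_size=8):
    """Return the leaf CUs of one square block as (x, y, size) tuples."""
    size = len(block)
    # The smallest CUs are never examined; larger ones stop early when the
    # score does not exceed the threshold for their size.
    if size <= min_size or score_fn(block) <= alpha[size]:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            sub = [row[dx:dx + half] for row in block[dy:dy + half]]
            leaves.extend(partition_cu(sub, score_fn, alpha, x + dx, y + dy, min_size))
    return leaves

def toy_score(block):
    """Stand-in score (NOT the patented one): normalised pixel range."""
    lo = min(min(row) for row in block)
    hi = max(max(row) for row in block)
    return (hi - lo) / 255.0

alpha = {64: 0.5, 32: 0.5, 16: 0.5}   # illustrative thresholds
ctu = [[0] * 64 for _ in range(64)]
ctu[0][0] = 255                       # a single bright pixel in the top-left corner
leaves = partition_cu(ctu, toy_score, alpha)
```

In this toy case only the blocks containing the bright pixel are split, giving three 32x32, three 16x16 and four 8x8 leaves. Because each sub-block is handled by an independent call, the four recursive calls at any level could also be dispatched in parallel.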

Steps S3, S7 and S11 each offer an opportunity to end the division early, so in many cases a decision is reached without traversing every possible division of a CU block, which avoids wasting encoding time and computing resources. The division decisions for different sub-CU blocks are independent of one another, so the program can process multiple sub-CU blocks in parallel, which further reduces encoding time. Only one step of each pair S3/S4, S7/S8 and S11/S12 is executed. Even when a 64x64 CU block has to be divided down to the smallest 8x8 CU blocks, only 9 steps are needed (S1, S2, S4, S5, S6, S8, S9, S10 and S12), comprising 3 DCT transform steps, 3 matrix operations and 3 comparisons, which greatly reduces the demand on processor computing resources.
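The operation counts can be checked with a short calculation. The sketch assumes one DCT, one matrix operation and one comparison per examined block and, for the exhaustive search, one evaluation per candidate CU at every size from 64x64 down to 8x8; the function and variable names are hypothetical.

```python
def blocks_per_level(ctu=64, smallest=8):
    """Number of blocks at each size if every block is split all the way down."""
    counts, size, n = {}, ctu, 1
    while size >= smallest:
        counts[size] = n
        size //= 2
        n *= 4
    return counts

counts = blocks_per_level()                                      # {64: 1, 32: 4, 16: 16, 8: 64}
exhaustive_candidates = sum(counts.values())                     # 85 candidate CUs
worst_case_scores = sum(n for s, n in counts.items() if s > 8)   # 21 (8x8 blocks are not scored)
single_path_scores = sum(1 for s in counts if s > 8)             # 3 (one block per level)
```

Under these assumptions a fully divided 64x64 block needs at most 21 score evaluations in total, and 3 along any single path from 64x64 to 8x8, compared with 85 candidate CUs in an exhaustive traversal.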

In the test experiments of this embodiment, the CU division depths obtained are similar to those of the traditional HEVC coding framework, and the encoding time is greatly reduced without affecting video quality or bit rate.

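This embodiment also provides a CU depth division system based on frequency domain distribution learning, including:

a first division module, configured to obtain a video image, divide the video image into several first CU blocks of size 64x64, obtain the frequency domain coefficient distribution matrix F₆₄ of the DCT transform of each first CU block, and obtain a probability score p₆₄ according to the frequency domain coefficient distribution matrix F₆₄;

a second division module, configured to, if p₆₄ > α₆₄, divide the first CU block downward into four second CU blocks of size 32x32, obtain the frequency domain coefficient distribution matrix F₃₂ of the DCT transform of each second CU block, and obtain a probability score p₃₂ according to the frequency domain coefficient distribution matrix F₃₂; and otherwise, end the division of the first CU block;

a third division module, configured to, if p₃₂ > α₃₂, divide the second CU block downward into four third CU blocks of size 16x16, obtain the frequency domain coefficient distribution matrix F₁₆ of the DCT transform of each third CU block, and obtain a probability score p₁₆ according to the frequency domain coefficient distribution matrix F₁₆; and otherwise, end the division of the second CU block;

a fourth division module, configured to, if p₁₆ > α₁₆, divide the third CU block downward into four fourth CU blocks of size 8x8; and otherwise, end the division of the third CU block.

The CU depth division system based on frequency domain distribution learning of this embodiment can execute the CU depth division method based on frequency domain distribution learning provided by the method embodiment of the present invention, can execute any combination of the implementation steps of the method embodiment, and has the corresponding functions and beneficial effects of the method.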

This embodiment also provides a CU depth division device based on frequency domain distribution learning, including:

at least one processor;

at least one memory for storing at least one program;

wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the method shown in FIG. 1.

The CU depth division device based on frequency domain distribution learning of this embodiment can execute the CU depth division method based on frequency domain distribution learning provided by the method embodiment of the present invention, can execute any combination of the implementation steps of the method embodiment, and has the corresponding functions and beneficial effects of the method.

An embodiment of the present application also discloses a computer program product or computer program, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device can read the computer instructions from the computer-readable storage medium and execute them, so that the computer device performs the method shown in FIG. 1.

This embodiment also provides a storage medium storing instructions or a program capable of executing the CU depth division method based on frequency domain distribution learning provided by the method embodiment of the present invention. When the instructions or program are run, any combination of the implementation steps of the method embodiment can be executed, with the corresponding functions and beneficial effects of the method.

In some alternative embodiments, the functions/operations mentioned in the block diagrams may occur out of the order shown in the operational diagrams. For example, depending on the functions/operations involved, two blocks shown in succession may in fact be executed substantially simultaneously, or the blocks may sometimes be executed in the reverse order. In addition, the embodiments presented and described in the flowcharts of the present invention are provided by way of example, in order to give a more complete understanding of the technology. The disclosed method is not limited to the operations and logic flow presented herein. Alternative embodiments are contemplated in which the order of the various operations is changed and in which sub-operations described as part of a larger operation are performed independently.

In addition, although the present invention is described in the context of functional modules, it should be understood that, unless stated otherwise, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in separate physical devices or software modules. It should also be understood that a detailed discussion of the actual implementation of each module is not necessary for understanding the present invention. Rather, given the attributes, functions and internal relationships of the various functional modules in the device disclosed herein, the actual implementation of the modules is within the routine skill of an engineer. Those skilled in the art can therefore, using ordinary skill, implement the present invention as set forth in the claims without undue experimentation. It should also be understood that the specific concepts disclosed are merely illustrative and are not intended to limit the scope of the present invention, which is determined by the full scope of the appended claims and their equivalents.

If the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device.

More specific examples (a non-exhaustive list) of computer-readable media include the following: an electrical connection having one or more wires (electronic device), a portable computer disk cartridge (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that the various parts of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following technologies known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field programmable gate array (FPGA), and the like.

In the above description of this specification, reference to the terms "one embodiment/example", "another embodiment/example", "certain embodiments/examples", and the like means that the specific features, structures, materials, or characteristics described in connection with the embodiment or example are included in at least one embodiment or example of the present invention. In this specification, such references do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples.

Although embodiments of the present invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations may be made to these embodiments without departing from the principles and spirit of the present invention, and that the scope of the present invention is defined by the claims and their equivalents.

The above is a specific description of the preferred embodiments of the present invention, but the present invention is not limited to these embodiments. Those skilled in the art may make various equivalent modifications or substitutions without departing from the spirit of the present invention, and all such equivalent modifications or substitutions fall within the scope defined by the claims of this application.

Claims

1. A CU depth division method based on frequency domain distribution learning, characterized by comprising the following steps:

acquiring a video image, dividing the video image into a plurality of first CU blocks of size 64x64, and acquiring the DCT (discrete cosine transform) frequency domain coefficient distribution matrix $F_{64}^{k}$ of each first CU block;

acquiring a probability score $p_{64}^{k}$ according to the frequency domain coefficient distribution matrix $F_{64}^{k}$;

wherein the superscript $k$ of $F_{64}^{k}$ indicates the k-th first CU block among the plurality of first CU blocks;

if $p_{64}^{k} > \alpha_{64}$, dividing the first CU block downward into 4 second CU blocks of size 32x32, acquiring the DCT frequency domain coefficient distribution matrix $F_{32}^{m}$ of each second CU block, and acquiring a probability score $p_{32}^{m}$ according to the frequency domain coefficient distribution matrix $F_{32}^{m}$; otherwise, ending the division of the first CU block; wherein the superscript $m$ of $F_{32}^{m}$ indicates the m-th of the second CU blocks divided downward from the first CU block;

if $p_{32}^{m} > \alpha_{32}$, dividing the second CU block downward into 4 third CU blocks of size 16x16, acquiring the DCT frequency domain coefficient distribution matrix $F_{16}^{n}$ of each third CU block, and acquiring a probability score $p_{16}^{n}$ according to the frequency domain coefficient distribution matrix $F_{16}^{n}$; otherwise, ending the division of the second CU block; wherein the superscript $n$ of $F_{16}^{n}$ indicates the n-th of the third CU blocks divided downward from the second CU block;

if $p_{16}^{n} > \alpha_{16}$, dividing the third CU block downward into 4 fourth CU blocks of size 8x8; otherwise, ending the division of the third CU block;

wherein $\alpha_{64}$, $\alpha_{32}$ and $\alpha_{16}$ are all division thresholds;

the probability score $p_{64}^{k}$ is calculated by the following formula:

[formula omitted in source]

the probability score $p_{32}^{m}$ is calculated by the following formula:

[formula omitted in source]

the probability score $p_{16}^{n}$ is calculated by the following formula:

[formula omitted in source]

in the formulas, $W_{64}$, $W_{32}$ and $W_{16}$ are preset frequency domain distribution weight matrices, $i$ represents the row coordinate of the matrix, and $j$ represents the column coordinate of the matrix.
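The decision cascade of claim 1 can be illustrated with a minimal Python sketch. Because the score formulas are not reproduced in this text, the sketch assumes, purely for illustration, that each probability score is the sigmoid of the element-wise weighted sum of the DCT coefficients with the weight matrix. The names `dct2`, `score`, and `partition` are hypothetical, and the weights and thresholds are supplied by the caller rather than taken from the patent.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)
    # 1-D orthonormal DCT-II basis: c[u][x].
    c = [[(math.sqrt(1 / n) if u == 0 else math.sqrt(2 / n))
          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
          for x in range(n)] for u in range(n)]
    # Separable transform: F = C * block * C^T.
    tmp = [[sum(c[u][x] * block[x][y] for x in range(n)) for y in range(n)]
           for u in range(n)]
    return [[sum(tmp[u][y] * c[v][y] for y in range(n)) for v in range(n)]
            for u in range(n)]


def score(F, W):
    """ASSUMED score: sigmoid of sum_ij W[i][j] * F[i][j] (not from the source)."""
    n = len(F)
    s = sum(W[i][j] * F[i][j] for i in range(n) for j in range(n))
    if s >= 0:  # numerically stable sigmoid
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def partition(block, weights, thresholds, x=0, y=0):
    """Return the leaf CUs of one block as a list of (x, y, size) tuples.

    weights and thresholds are dicts keyed by block size (64, 32, 16).
    """
    n = len(block)
    if n == 8:  # smallest CU: nothing left to decide
        return [(x, y, n)]
    p = score(dct2(block), weights[n])
    if not p > thresholds[n]:  # early termination at this depth
        return [(x, y, n)]
    h = n // 2
    leaves = []
    for dy in (0, h):  # quadtree split into four sub-CUs
        for dx in (0, h):
            sub = [row[dx:dx + h] for row in block[dy:dy + h]]
            leaves += partition(sub, weights, thresholds, x + dx, y + dy)
    return leaves
```

With all-zero weight matrices every assumed score equals 0.5, so thresholds above 0.5 keep a 64x64 block whole, while thresholds below 0.5 split it down to sixty-four 8x8 blocks.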

2. The CU depth division method based on frequency domain distribution learning according to claim 1, wherein the frequency domain distribution weight matrix $W_{64}$ is obtained by:

acquiring the training set and samples required by the network, each sample comprising $F_{64}^{k}$ and $L_k$, wherein $F_{64}^{k}$ represents the DCT frequency domain coefficient distribution matrix corresponding to the k-th CU block of size 64x64, and $L_k$ indicates whether the k-th CU block of size 64x64 continues to be divided downward, $L_k = 0$ indicating no and $L_k = 1$ indicating yes;

training the network according to a preset loss function to obtain the frequency domain distribution weight matrix $W_{64}$;

the expression of the preset loss function is as follows:

[formula omitted in source]

where $N$ represents the number of samples in the training set, and $n$ represents the size of the CU block.
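As a rough illustration of how a weight matrix could be learned from labeled samples, the following Python sketch fits the weights by batch gradient descent. The loss expression is not reproduced in this text, so the sketch assumes a binary cross-entropy loss over a sigmoid score; that choice, the learning rate, the epoch count, and the names `sigmoid`, `predict`, and `train_weights` are all assumptions and not the patented training procedure.

```python
import math


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def predict(F, W):
    """ASSUMED score: sigmoid of the element-wise weighted sum of F with W."""
    n = len(F)
    return sigmoid(sum(W[i][j] * F[i][j] for i in range(n) for j in range(n)))


def train_weights(samples, n, lr=0.1, epochs=200):
    """Fit an n x n weight matrix from (F, L) pairs, L in {0, 1}.

    Minimises the mean binary cross-entropy (an assumption, see lead-in);
    its gradient with respect to W[i][j] is mean((p - L) * F[i][j]).
    """
    W = [[0.0] * n for _ in range(n)]
    for _ in range(epochs):
        grad = [[0.0] * n for _ in range(n)]
        for F, L in samples:
            err = predict(F, W) - L
            for i in range(n):
                for j in range(n):
                    grad[i][j] += err * F[i][j] / len(samples)
        for i in range(n):
            for j in range(n):
                W[i][j] -= lr * grad[i][j]
    return W
```

On a toy pair of 2x2 coefficient matrices, one flat (label 0) and one with high-frequency energy (label 1), the fitted weights score the first below 0.5 and the second above 0.5.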

3. The CU depth division method based on frequency domain distribution learning according to claim 2, wherein the division threshold $\alpha_{64}$ is obtained by:

selecting from the training set the samples labeled $L_k = 0$;

calculating the probability scores $p_{64}^{k}$ from the selected samples;

acquiring the division threshold $\alpha_{64}$ from the calculated probability scores $p_{64}^{k}$;

wherein the division threshold $\alpha_{64}$ is given by:

[formula omitted in source]
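The threshold selection described in claim 3 can be sketched as follows. The threshold expression itself is not reproduced in this text, so the statistic applied to the scores of the $L_k = 0$ samples is left as a caller-supplied `reducer`; the default of the arithmetic mean is only a placeholder assumption, and `division_threshold` is a hypothetical name.

```python
from statistics import mean


def division_threshold(scores, labels, reducer=mean):
    """Derive a threshold from the scores of samples labeled 0 (no split).

    scores  -- probability scores, one per training sample
    labels  -- matching labels, 0 = do not divide, 1 = divide
    reducer -- statistic applied to the selected scores (ASSUMED; the
               source formula is not available)
    """
    negatives = [p for p, label in zip(scores, labels) if label == 0]
    if not negatives:
        raise ValueError("training set contains no samples labeled 0")
    return reducer(negatives)
```

Passing `reducer=max` instead yields the most conservative choice among the non-split samples, which would stop division only for blocks scoring no higher than any training block that was left undivided.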

4. The CU depth division method based on frequency domain distribution learning according to claim 1, wherein dividing the video image into a plurality of first CU blocks of size 64x64 comprises:

dividing the video image into first CU blocks of size 64x64 according to the luminance component.
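A minimal Python sketch of luminance-based block division is given below. It assumes the frame arrives as rows of RGB triples and derives luma with the BT.601 weights (0.299, 0.587, 0.114); when the video is already stored in a YUV format, the Y plane would be used directly. The names `luma_plane` and `split_blocks` are hypothetical.

```python
def luma_plane(rgb_rows):
    """Convert rows of (R, G, B) triples to a luma plane (BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_rows]


def split_blocks(plane, size=64):
    """Map (x, y) top-left corners to the full size x size blocks of a plane.

    Only complete blocks are returned; leftover edge pixels are ignored here.
    """
    h, w = len(plane), len(plane[0])
    return {(x, y): [row[x:x + size] for row in plane[y:y + size]]
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)}
```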

5. The CU depth division method based on frequency domain distribution learning according to claim 1, wherein dividing the video image into a plurality of first CU blocks of size 64x64 comprises:

after dividing the video image into a plurality of first CU blocks of size 64x64, performing pixel interpolation on the remaining pixel areas that are smaller than 64x64.
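One simple way to handle leftover edge regions is to extend the image until both dimensions are multiples of 64. The claim does not specify the interpolation method, so the Python sketch below substitutes edge replication (repeating the last column and the last row) as an assumed stand-in; `pad_to_multiple` is a hypothetical name.

```python
def pad_to_multiple(plane, size=64):
    """Extend a plane so that width and height are multiples of size.

    Uses edge replication, which is an ASSUMED stand-in for the unspecified
    pixel interpolation in the claim.
    """
    h, w = len(plane), len(plane[0])
    pad_w = (-w) % size  # columns to append on the right
    pad_h = (-h) % size  # rows to append at the bottom
    rows = [list(row) + [row[-1]] * pad_w for row in plane]
    rows += [list(rows[-1]) for _ in range(pad_h)]
    return rows
```

After padding, every pixel of the original image belongs to some complete 64x64 block, so the division of claim 1 applies to the whole frame.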

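6. A CU depth division system based on frequency domain distribution learning, comprising:

a first dividing module for acquiring a video image, dividing the video image into a plurality of first CU blocks of size 64x64, acquiring the DCT frequency domain coefficient distribution matrix $F_{64}^{k}$ of each first CU block, and acquiring a probability score $p_{64}^{k}$ according to the frequency domain coefficient distribution matrix $F_{64}^{k}$; wherein the superscript $k$ of $F_{64}^{k}$ indicates the k-th first CU block among the plurality of first CU blocks;

a second dividing module for, if $p_{64}^{k} > \alpha_{64}$, dividing the first CU block downward into 4 second CU blocks of size 32x32, acquiring the DCT frequency domain coefficient distribution matrix $F_{32}^{m}$ of each second CU block, and acquiring a probability score $p_{32}^{m}$ according to the frequency domain coefficient distribution matrix $F_{32}^{m}$; otherwise, ending the division of the first CU block; wherein the superscript $m$ of $F_{32}^{m}$ indicates the m-th of the second CU blocks divided downward from the first CU block;

a third dividing module for, if $p_{32}^{m} > \alpha_{32}$, dividing the second CU block downward into 4 third CU blocks of size 16x16, acquiring the DCT frequency domain coefficient distribution matrix $F_{16}^{n}$ of each third CU block, and acquiring a probability score $p_{16}^{n}$ according to the frequency domain coefficient distribution matrix $F_{16}^{n}$; otherwise, ending the division of the second CU block; wherein the superscript $n$ of $F_{16}^{n}$ indicates the n-th of the third CU blocks divided downward from the second CU block;

a fourth dividing module for, if $p_{16}^{n} > \alpha_{16}$, dividing the third CU block downward into 4 fourth CU blocks of size 8x8; otherwise, ending the division of the third CU block;

wherein $\alpha_{64}$, $\alpha_{32}$ and $\alpha_{16}$ are all division thresholds;

the probability score $p_{64}^{k}$ is calculated by the following formula:

[formula omitted in source]

the probability score $p_{32}^{m}$ is calculated by the following formula:

[formula omitted in source]

the probability score $p_{16}^{n}$ is calculated by the following formula:

[formula omitted in source]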

7. A CU depth division device based on frequency domain distribution learning, comprising:

at least one processor;

at least one memory for storing at least one program;

the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-5.

8. A computer readable storage medium, in which a processor executable program is stored, characterized in that the processor executable program is for performing the method according to any of claims 1-5 when being executed by a processor.