CN112887712B - HEVC intra-frame CTU partitioning method based on convolutional neural network - Google Patents

Info

Publication number
CN112887712B
Authority
CN
China
Prior art keywords: size, division result, CTU, current, frame
Legal status: Active
Application number
CN202110147340.7A
Other languages
Chinese (zh)
Other versions
CN112887712A (en)
Inventors
汪大勇 (Wang Dayong)
徐太杰 (Xu Taijie)
赵奕婷 (Zhao Yiting)
Current Assignee
Guangzhou Dayu Chuangfu Technology Co., Ltd.
Original Assignee
Chongqing University of Posts and Telecommunications
Application filed by Chongqing University of Posts and Telecommunications
Priority to CN202110147340.7A
Publication of CN112887712A
Application granted
Publication of CN112887712B
Status: Active

Classifications

    • H04N19/119: Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H04N19/147: Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/172: Adaptive coding characterised by the coding unit, the unit being a picture, frame or field
    • H04N19/42: Implementation details or hardware specially adapted for video compression or decompression
    • H04N19/70: Syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/96: Tree coding, e.g. quad-tree coding
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods


Abstract

The invention belongs to the technical field of video coding, and specifically relates to a convolutional-neural-network-based HEVC intra-frame CTU partitioning method comprising the following steps: a quadtree neural network model processes a video frame to obtain the quadtree structures of all CTUs in the frame; an optimized encoder then encodes the current frame and partitions the CTUs in the HEVC frame according to those quadtree structures. The method optimizes the quadtree with multiple convolutional neural networks to form a quadtree neural network model; through this model all CTUs in an HEVC frame can be partitioned, the correlation between individual partition results is strengthened, data-processing efficiency is improved, and encoding time is reduced.

Description

A Convolutional-Neural-Network-Based HEVC Intra-Frame CTU Partitioning Method

Technical Field

The invention belongs to the technical field of video coding, and in particular relates to a convolutional-neural-network-based method for partitioning CTUs within HEVC frames.

Background

As video moves toward ultra-high definition, and with the emergence of new video applications such as short video, live streaming, and video on demand, the storage and transmission of video poses a huge challenge. In 2013 the joint expert group therefore released the new-generation High Efficiency Video Coding (HEVC) standard, which aims to compress massive video data effectively so that it can be stored and transmitted within limited bandwidth; its compression rate is roughly double that of the previous-generation standard H.264/AVC. While improving coding efficiency, HEVC adopts more complex coding structures such as quadtree partitioning, which sharply increases coding complexity and seriously limits HEVC's practicality.

HEVC intra prediction is generally optimized at the CU and PU levels. Popular fast algorithms for HEVC intra prediction include: using smooth regions to reduce the maximum CU partition depth, reducing PU prediction directions based on texture, reducing the number of CU traversals based on references, reducing the number of prediction modes according to the quadtree hierarchy, grouping prediction directions for fast mode selection, and predicting the current block's prediction mode from the linear interpolation pattern of its surrounding pixels. For example, patent application CN202010627907.6, "Video coding method and coding tree unit division method, system, device and readable storage medium", discloses a coding-tree-unit partitioning method: a number of coding units whose partitions have already been determined are split into a training set and a test set; a deep convolutional neural network is trained and tested to obtain a prediction model; the model predicts the partition of each coding unit until all coding units within a coding tree unit have been predicted, and the partition of each coding tree unit is then obtained from the predictions of all its coding units. That method runs the coding-tree-unit partitioning thread and the encoding thread concurrently; at the partitioning stage, the encoder uses the partition results shared by the partitioning thread, which effectively reduces the complexity of determining the coding-unit partition and reduces the overall video encoding time.

However, the convolutional neural networks in that method are independent of one another, so the scheme can only decide whether the current CU should be split, without minimizing the rate-distortion cost of the CTU as a whole.

Summary of the Invention

To solve the above problems in the prior art, the present invention proposes a convolutional-neural-network-based HEVC intra-frame CTU partitioning method, comprising:

S1: obtain video data and convert it into video frames;

S2: input a video frame into the quadtree neural network model to obtain the quadtree structures of all CTUs in the frame;

S3: use the optimized encoder to read the quadtree structures of all CTUs of the current frame, and partition the coding units (CUs) of the current coding tree unit (CTU) according to the current CTU's quadtree structure;

S4: obtain the partition result for the current CU. If the result is 0, compute the current CU's RD-cost and stop recursing downward; if the result is 1, skip the RD-cost computation and continue recursing downward; if the result is uncertain, use a Gaussian discriminator to decide the CU's partition result. Repeat until recursion stops or the minimum CU size is reached;

S5: when the CU size reaches 8×8, partition the prediction unit (PU). If the PU's partition result is 0, the PU size is 8×8; if it is 1, split the PU into four 4×4 PUs; if it is uncertain, determine the PU size by a statistical method.

Preferably, the process of obtaining a CTU's quadtree structure comprises:

S21: run the encoder HM, input the video frame data, and compute the quadtree structures of all CTUs of the n-th frame of the video;

S22: the computed result is saved in a file named by the frame number, and a flag file pred_over_n.txt is output, where n is the frame index;

S23: when the encoder HM detects the flag file pred_over_n.txt, it starts encoding frame n; after frame n is encoded, it checks for the next frame's flag file and continues, until all frames have been encoded.
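The handshake in S21–S23 can be sketched as follows. This is a minimal illustration only: the real encoder is the C++ HM polling files written by a Python predictor, and the exact file names (`pred_over_<n>.txt`, `frame_<n>.txt`) and polling interval are assumptions here.

```python
import os
import time

def wait_for_flag(directory, frame_index, timeout=5.0, poll=0.01):
    """Block until the predictor's flag file for this frame appears.

    Mirrors S23: the encoder reads a frame's quadtree file only after
    the matching pred_over_<n>.txt flag exists, so it never reads a
    half-written (dirty) result.
    """
    flag = os.path.join(directory, f"pred_over_{frame_index}.txt")
    deadline = time.monotonic() + timeout
    while not os.path.exists(flag):
        if time.monotonic() > deadline:
            raise TimeoutError(f"no flag for frame {frame_index}")
        time.sleep(poll)
    return flag

def publish_frame(directory, frame_index, quadtrees):
    """Predictor side (S22): write the result file first, the flag last."""
    result = os.path.join(directory, f"frame_{frame_index}.txt")
    with open(result, "w") as f:
        for tree in quadtrees:
            f.write(" ".join(str(v) for v in tree) + "\n")
    # The flag is created only after the data is fully on disk.
    with open(os.path.join(directory, f"pred_over_{frame_index}.txt"), "w"):
        pass
    return result
```

Writing the data file before the flag file is what makes the protocol safe: existence of the flag implies the result is complete.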

Preferably, the quadtree neural network model comprises 21 small-scale sub-convolutional neural networks.

Preferably, the specific process of partitioning the coding units comprises:

S41: each coding tree unit CTU is 64×64. If its partition result is 1, split the CTU into four 32×32 coding units; if 0, stop recursing and compute the current CTU's RD-cost; if uncertain, compute the CTU's skewness and kurtosis, split into four 32×32 CUs if they satisfy the Gaussian distribution, and otherwise stop recursing;

S42: split the depth-0 CTU into four 32×32 CUs; the encoder encodes the four 32×32 CUs one by one in Z-scan order, with the depth set to 1. Obtain each 32×32 CU's partition result from the quadtree structure: if 1, split the CU into four 16×16 CUs; if 0, stop recursing and compute the 32×32 CU's RD-cost; if uncertain, apply the Gaussian test, splitting when the result is 1 and stopping the recursion when it is 0;

S43: split the depth-1 CU into four 16×16 CUs; the encoder encodes the four 16×16 CUs one by one in Z-scan order, with the depth set to 2. Obtain each 16×16 CU's partition result from the quadtree structure: if 1, split the CU into four 8×8 CUs; if 0, stop recursing and compute the 16×16 CU's RD-cost; if uncertain, apply the Gaussian test, splitting the depth-2 CU into four 8×8 CUs on 1 and stopping the recursion on 0;

S44: when an 8×8 CU is reached, obtain the 8×8 prediction unit's partition result from the quadtree structure: if 1, split the PU into four 4×4 PUs; if 0, keep an 8×8 PU; if uncertain, determine the result by the statistical method, splitting into four 4×4 PUs on 1 and keeping an 8×8 PU on 0.
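The recursion in S41–S44 can be sketched as follows. This is a simplified illustration: labels 1/0/None stand for split / don't split / uncertain, and `gaussian_test` and `pu_test` are hypothetical stand-ins for the Gaussian discriminator and the PU statistic; the real decisions are per-block, not per-size.

```python
MIN_CU = 8  # smallest CU size (8x8); its label is the PU decision (S44)

def partition_cu(size, get_label, gaussian_test, pu_test, leaves=None):
    """Recursively decide one CTU's partition (S41-S44).

    get_label(size) returns 1 (split), 0 (don't split) or None
    (uncertain), as read from the predicted quadtree structure.
    Returns the list of leaf block sizes in Z-scan order.
    """
    if leaves is None:
        leaves = []
    label = get_label(size)
    if size == MIN_CU:
        if label is None:
            label = 1 if pu_test() else 0      # statistical PU test (S5)
        leaves.extend([4, 4, 4, 4] if label == 1 else [8])
        return leaves
    if label is None:
        label = 1 if gaussian_test() else 0    # Gaussian discriminator (S4)
    if label == 0:
        leaves.append(size)                    # stop here; RD-cost is computed once
    else:
        for _ in range(4):                     # four quadrants, Z-scan order
            partition_cu(size // 2, get_label, gaussian_test, pu_test, leaves)
    return leaves
```

Because a split label skips the RD-cost computation at that level entirely, the encoder only evaluates RD-cost at the leaves of the predicted tree.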

Further, the process of computing the RD-cost comprises:

Step 1: compute the rate-distortion cost J_SKIP of SKIP mode, a skip mode of the PU;

Step 2: compute the rate-distortion costs of the inter-prediction modes in turn and find their minimum, J_inter;

Step 3: compute the rate-distortion costs of the intra-prediction modes and find their minimum, J_intra;

Step 4: compare J_SKIP, J_inter, and J_intra; the smallest of the three is the RD-cost.

Further, the rate-distortion cost is computed as:

J = B + λD, where B is the number of bits for CU size selection and mode selection, λ is the Lagrange multiplier, and D is the image distortion.
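Steps 1 to 4 and the cost formula reduce to a minimum over three candidate costs; a minimal sketch (the mode cost lists and λ value are placeholders, and a real encoder evaluates many more modes):

```python
def rd_cost(bits, distortion, lam):
    """Rate-distortion cost J = B + lambda * D."""
    return bits + lam * distortion

def best_rd_cost(skip_cost, inter_costs, intra_costs):
    """Steps 1-4: the final RD-cost is the minimum over SKIP mode,
    the best inter mode (J_inter) and the best intra mode (J_intra)."""
    j_inter = min(inter_costs)
    j_intra = min(intra_costs)
    return min(skip_cost, j_inter, j_intra)
```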

Preferably, the Gaussian discriminator's judgment of an uncertain CU partition result comprises:

Step 1: compute the residual of the coding unit CU;

Step 2: compute the k-th-order sample central moments B_k from the residual;

Step 3: compute the residual's skewness estimate G1 and kurtosis estimate G2 from B_k;

Step 4: since the skewness estimate G1 and kurtosis estimate G2 follow a Gaussian distribution, convert G1 and G2 into Gaussian form, and compute the mean and variance of the residual skewness and kurtosis from that distribution;

Step 5: set the critical value z_α; set the Gaussian decision condition from the skewness estimate G1, the kurtosis estimate G2, the means, the variances, and the critical value;

Step 6: if the CU's skewness and kurtosis satisfy the Gaussian decision condition, do not split the CU; otherwise, split it.
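The patent's exact expressions for G1, G2 and the decision bounds appear only as images, so the sketch below substitutes the standard moment-based estimates and their textbook large-sample variances (6/n for skewness, 24/n for kurtosis). These are stand-in assumptions, not the patent's formulas.

```python
def central_moment(xs, k):
    """k-th order sample central moment B_k of the residual (step 2)."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** k for x in xs) / n

def gaussian_decision(residual, z_alpha=1.96):
    """Steps 3-6: keep the CU unsplit when skewness and kurtosis are
    consistent with a Gaussian residual.  The variances 6/n and 24/n
    are the classical asymptotic values, assumed here in place of the
    patent's image-only expressions."""
    n = len(residual)
    b2 = central_moment(residual, 2)
    g1 = central_moment(residual, 3) / b2 ** 1.5    # skewness estimate G1
    g2 = central_moment(residual, 4) / b2 ** 2 - 3  # excess kurtosis estimate G2
    ok_skew = abs(g1) <= z_alpha * (6 / n) ** 0.5
    ok_kurt = abs(g2) <= z_alpha * (24 / n) ** 0.5
    return ok_skew and ok_kurt  # True -> do not split the CU
```

A symmetric, moderately peaked residual passes the test (no split); a strongly skewed residual fails it and the CU is split.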

Further, the Gaussian expressions for the skewness estimate G1 and the kurtosis estimate G2 are:

[The expressions for G1 and G2 are given only as equation images in the original patent and are not reproduced in the text.]

Further, the Gaussian decision conditions μ1 and μ2 are computed as:

[The formulas for μ1 and μ2 are given only as equation images in the original patent and are not reproduced in the text.]

Preferably, the statistical partitioning of the prediction unit PU comprises:

Step 1: compute the test statistic of the prediction unit (its expression is given only as an equation image in the original patent);

Step 2: compute the prior detection threshold Q_step;

Step 3: judge, using the prior detection threshold, whether the statistic satisfies the prior hypothesis; if it does, the PU size is 8×8; otherwise, split the PU into four 4×4 PUs.

The present invention optimizes the quadtree with multiple convolutional neural networks to form a quadtree neural network model. Through this model all CTUs in an HEVC frame can be partitioned, the correlation between the individual partition results is strengthened, data-processing efficiency is improved, and encoding time is reduced.

Brief Description of the Drawings

Fig. 1 is a schematic diagram of the neural network structure for quadtree-structure prediction according to the present invention;

Fig. 2 is a structural diagram of the neural network for PU size prediction according to the present invention;

Fig. 3 is a flowchart of the asynchronous operation of the encoder and the predictor according to the present invention;

Fig. 4 is a flowchart of the optimized HM encoder's intra-prediction CTU partitioning method according to the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

The present invention obtains the partition quadtree structure of an entire CTU from the current CTU's image information and quantization parameter. A predicted probability greater than 0.85 is treated as 1, a probability less than 0.15 as 0, and a result between 0.15 and 0.85 as uncertain; an uncertain result is resolved by computing the skewness and kurtosis of the corresponding CU and judging whether they satisfy the Gaussian distribution. The 8×8 PU partition is handled in a similar way.
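The probability-to-label mapping described above can be written directly (the thresholds 0.85 and 0.15 are taken from the text; the return value None marks the uncertain band resolved later by the skewness/kurtosis test):

```python
SPLIT_T, NO_SPLIT_T = 0.85, 0.15

def label_from_probability(p):
    """Map a network output in [0, 1] to a partition label:
    1 = split, 0 = don't split, None = uncertain."""
    if p > SPLIT_T:
        return 1
    if p < NO_SPLIT_T:
        return 0
    return None
```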

A convolutional-neural-network-based HEVC intra-frame CTU partitioning method comprises:

S1: obtain video data and convert it into video frames;

S2: input the video frames into the asynchronously running predictor and encoder to obtain the quadtree structures of all CTUs in each frame;

S3: use the optimized encoder to read the quadtree structures of all CTUs of the current frame, and partition the coding units (CUs) of the current coding tree unit (CTU) according to the current CTU's quadtree structure;

S4: obtain the partition result for the current CU. If the result is 0, compute the current CU's RD-cost and stop recursing downward; if the result is 1, skip the RD-cost computation and continue recursing downward; if the result is uncertain, use a Gaussian discriminator to decide. Repeat until recursion stops or the minimum CU size is reached;

S5: when the CU size reaches 8×8, partition the prediction unit (PU). If the PU's partition result is 0, the PU size is 8×8; if it is 1, split the PU into four 4×4 PUs; if it is uncertain, determine the PU size by a statistical method.

The main body of the encoder program is HM, written in C++. In the present invention the CNN is implemented in TensorFlow, in Python; communication between the two programs indirectly transfers information between HM and the deep neural network. Both encoding and CNN prediction proceed frame by frame: the split probability of every possible CU is obtained, and the predicted CU split results are then used to simplify the RDO process and optimize complexity.

As shown in Fig. 1, in the neural network structure for quadtree-structure prediction, Dense_1 represents one node of the quadtree, and its predicted value represents that node's partition result. What is predicted is the partition of the entire CTU, i.e. the structure of the whole quadtree, so optimizing the network amounts to optimizing the whole tree.

As shown in Fig. 3, the process of obtaining a CTU's quadtree structure comprises:

S21: run the encoder HM and input the image frame data into it, to obtain the quadtree structures of all CTUs of the n-th frame;

S22: the predictor computes the quadtree structures of all CTUs in the n-th frame, saves the result in a file named by the frame number, and outputs a flag file pred_over_n.txt, where n is the frame index;

S23: when the encoder HM detects the flag file pred_over_n.txt, it starts encoding frame n; after frame n is encoded, it checks for the next frame's flag file and continues, until all frames have been encoded.

The encoder HM performs a read only after it has detected the flag file for the corresponding frame, to avoid reading dirty data.

Each line of the result file stores one quadtree in array form; by building a two-dimensional array, HM can store the quadtree partition structures of all CTUs of a frame. When encoding a CTU, the encoder simply fetches the corresponding quadtree structure and thereby obtains the partition of the entire CTU.
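One natural array layout consistent with the 21 sub-networks (one root node, 4 depth-1 nodes, 16 depth-2 nodes) stores the tree breadth-first, so a node's children are found by index arithmetic. This layout is an assumption for illustration, not the patent's exact file format:

```python
TREE_LEN = 21  # 1 root + 4 depth-1 nodes + 16 depth-2 nodes

def children(i):
    """Indices of the four children of node i in a breadth-first
    array-stored quadtree (root at index 0)."""
    return [4 * i + 1 + k for k in range(4)]

def node_depth(i):
    """Depth of node i in the 21-node tree (0 = root, i.e. the 64x64 CTU)."""
    if i == 0:
        return 0
    return 1 if i <= 4 else 2
```

With this layout, one line of 21 values per CTU is enough for the encoder to answer every split/no-split query while recursing.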

The process of computing the RD-cost comprises: computing the number of bits for CU size selection and mode selection; computing the image distortion; and summing them to obtain the RD-cost, defined as J = B + λD, where B is the number of bits for CU size selection and mode selection, λ is the Lagrange multiplier, and D is the image distortion. In detail: (1) compute the rate-distortion cost J_SKIP of SKIP mode; (2) compute the rate-distortion costs of the inter-prediction modes in turn and find their minimum, J_inter; (3) compute the rate-distortion costs of the intra-prediction modes and find their minimum, J_intra; (4) compare J_SKIP, J_inter, and J_intra; the smallest of the three is the RD-cost.

As shown in Fig. 1, the neural network for quadtree-structure prediction consists of: an input layer, 4 first convolutional layers, 4 first pooling layers, 16 second convolutional layers, 16 second pooling layers, 4 first feature-fusion layers, 4 first global-average-pooling layers, 16 first fully connected layers, 16 second feature-fusion layers, 16 second fully connected layers, 16 first output layers, 5 third feature-fusion layers, 5 third convolutional layers, 5 third pooling layers, 5 fourth convolutional layers, 5 fourth pooling layers, 5 second global-average-pooling layers, 5 fourth feature-fusion layers, 5 third fully connected layers, 5 fifth feature-fusion layers, 5 fourth fully connected layers, and 5 second output layers. The first convolutional layers have 8 channels with 3×3 kernels; the first pooling layers perform max-pooling with a 2×2 window; the second convolutional layers have 16 channels with 3×3 kernels; the second pooling layers perform max-pooling with a 2×2 window; the first feature-fusion layers fuse the features of the preceding layers as the current layer's output; the first and second global-average-pooling layers output the mean of each feature map of the preceding layer; the first fully connected layers have 32 neurons; the second feature-fusion layers fuse the preceding 32 neurons with the quantization parameter; the second fully connected layers have 16 neurons. The outputs of the first output layers are real numbers in [0, 1]: outputs 1-4 correspond to the partition results of quadtree nodes 5-8, outputs 5-8 to nodes 9-12, outputs 9-12 to nodes 13-16, and outputs 13-16 to nodes 17-20. The third feature-fusion layers fuse the feature maps of the four first pooling layers; the third convolutional layers have 16 channels with 3×3 kernels; the third pooling layers perform max-pooling with a 3×3 window; the fourth convolutional layers have 32 channels with 3×3 kernels; the fourth pooling layers perform max-pooling with a 3×3 window; the fourth feature-fusion layers fuse the upper-layer output with the quantization parameter; the third fully connected layers have 32 neurons; the fifth feature-fusion layers fuse the upper-layer output with the quantization parameter; the fourth fully connected layers have 16 neurons. The outputs of the second output layers are real numbers in [0, 1]: the four left outputs are the partition results of quadtree nodes 1-4, and the rightmost output is the partition result of the root node.

As shown in Figure 2, the neural network for PU size prediction consists of: an input layer, a first convolutional layer, a first pooling layer, a second convolutional layer, a first global average pooling layer, a first feature fusion layer, a first fully connected layer, a second fully connected layer, and an output layer. The first convolutional layer has 16 channels with a 3×3 kernel; the first pooling layer performs max pooling with a 2×2 window; the second convolutional layer has 24 channels with a 3×3 kernel; the first global average pooling layer takes the mean of each feature map of the preceding layer as its output; the first feature fusion layer fuses the output of the preceding layer with the quantization parameter; the first fully connected layer has 256 neurons; the second fully connected layer has 192 neurons; the output layer produces a real number in [0,1].

The process of training the neural networks includes:

Step 1: Obtain a training set for the convolutional neural networks.

Training the neural networks requires a large amount of data. Video sequences are collected and encoded with the standard encoder to produce the partition result of every CTU; each partition result is then converted into the quadtree array form and used as the label of the data set. The input data are the Y-component pixels of each 64×64 CTU together with the corresponding quantization parameter (22, 27, 32, or 37). The training set is 80% of the total data set, the test set is 20%, and the validation set consists of the official standard video sequences. The PU data set is produced in the same way, and the corresponding model is obtained by training the PU size prediction network.
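The conversion from a CTU's partition result to the 21-element quadtree label array can be sketched as below. This is illustrative only: the patent does not spell out the label ordering, so the layout (1 root label, then 4 quadrant labels, then 16 leaf labels, matching the 21 sub-classifiers described later) and the per-16×16-block depth-map input are assumptions.

```python
def ctu_labels(depth_map):
    """Convert a 4x4 grid of per-16x16-block partition depths (0..3)
    into 21 binary labels: 1 root label (split the 64x64?), 4 quadrant
    labels (split each 32x32?), 16 leaf labels (split each 16x16?)."""
    labels = [1.0 if any(d > 0 for row in depth_map for d in row) else 0.0]
    for qy in (0, 2):                       # the four 32x32 quadrants
        for qx in (0, 2):
            quad = [depth_map[qy + j][qx + i] for j in range(2) for i in range(2)]
            labels.append(1.0 if any(d > 1 for d in quad) else 0.0)
    for j in range(4):                      # the sixteen 16x16 blocks
        for i in range(4):
            labels.append(1.0 if depth_map[j][i] > 2 else 0.0)
    return labels
```

A CTU that is never split yields an all-zero label vector; a single block split down to 8×8 turns on exactly its root, quadrant, and leaf labels.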

Step 2: Train the neural networks.

Build the quadtree prediction network and set the loss function. The network that predicts the quadtree structure is formed by fusing 21 sub-classifiers into one neural network: 1 classifier for the 64×64 node, 4 for the 32×32 nodes, and 16 for the 16×16 nodes. The 21 classifiers are trained together and influence one another; the loss of every classifier carries the same weight, so every classifier is equally important, and minimizing the total loss optimizes the entire quadtree structure. The total loss function is defined by the following formula:

loss = Σ_{i=1}^{21} ω_i · loss_i

where loss_i is the loss function of each sub-classifier and ω_i is the weight of each sub-classifier. loss_i is defined as follows:

loss_i = −[y_i · ln(a_i) + (1 − y_i) · ln(1 − a_i)]

where y_i is the ground-truth value and a_i is the predicted value.
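The two formulas above can be sketched directly in code. The equal weights follow the text; the clamping constant eps is an implementation detail added here to avoid log(0):

```python
import math

def bce(y, a, eps=1e-12):
    # Binary cross-entropy for one sub-classifier: -[y*ln(a) + (1-y)*ln(1-a)].
    a = min(max(a, eps), 1.0 - eps)          # clamp to avoid log(0)
    return -(y * math.log(a) + (1.0 - y) * math.log(1.0 - a))

def total_loss(y_true, y_pred, weights=None):
    # Weighted sum over the 21 sub-classifier losses; equal weights by default.
    w = weights or [1.0] * len(y_true)
    return sum(wi * bce(yi, ai) for wi, yi, ai in zip(w, y_true, y_pred))
```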

Then set the corresponding parameters; once the network is built, training can begin.

Build the PU size prediction network according to Figure 2; its loss function is the binary-classification loss provided by the deep learning framework. Set the corresponding parameters, and start training once the network is built.

For uncertain CU results, a Gaussian test determines the partition. In general, residual coefficients are modeled with either a Gaussian or a Laplace distribution; which distribution applies must be determined experimentally, and experiments here indicate a Gaussian distribution. The method therefore studies the skewness and kurtosis of the residuals, tests whether they follow a Gaussian distribution, and exploits the relationship between Gaussian-distributed coding blocks and the partition result.

If x_1, x_2, …, x_n are the residual coefficients, the skewness estimate G_1 and the kurtosis estimate G_2 are defined by the following formulas:

G_1 = B_3 / B_2^(3/2),  G_2 = B_4 / B_2²

where G_1 denotes the skewness estimate, B_k denotes the k-th order sample central moment, and G_2 denotes the kurtosis estimate.

B_k is defined by the following formula:

B_k = (1/n) · Σ_{i=1}^{n} (x_i − x̄)^k

where x̄ is the mean of the x_i. G_1 and G_2 themselves approximately follow Gaussian distributions, so they can be written in the following Gaussian form:

G_1 ~ N(u_1, σ_1²),  G_2 ~ N(u_2, σ_2²)

where u_1 denotes the mean of the skewness estimate, σ_1² denotes its variance, n denotes the number of residual coefficients, u_2 denotes the mean of the kurtosis estimate, and σ_2² denotes its variance.

Whether the Gaussian distribution is satisfied can then be judged with the following statistics:

μ_1 = (G_1 − u_1) / σ_1,  μ_2 = (G_2 − u_2) / σ_2

where μ_1 denotes the statistical hypothesis test value of the skewness estimate G_1, σ_1 denotes the variance of the skewness estimate, μ_2 denotes the statistical hypothesis test value of the kurtosis estimate G_2, u_2 denotes the mean of the kurtosis estimate, and σ_2 denotes the variance of the kurtosis estimate.

According to statistical hypothesis testing, the significance level α denotes the probability of wrongly rejecting the null hypothesis that the distribution is Gaussian. The corresponding critical test value z_α can be obtained from the standard Gaussian table. The condition for following a Gaussian distribution is judged by:

|μ_1| ≤ z_α,  |μ_2| ≤ z_α

If both conditions hold, the residual coefficients are regarded as following a Gaussian distribution, and in that case the block is not partitioned. Choosing the best α and the corresponding z_α is crucial for effectively increasing the encoding speed while maintaining coding efficiency. The z_α values differ at different depths; they can be embedded in the encoder and the best values obtained experimentally. The table below lists the parameter values at each depth.

(The z_α parameter values at each depth are given as a table image in the original.)
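A sketch of the whole Gaussian test, from central moments to the decision. The patent gives u_1, σ_1², u_2, and σ_2² only as images, so the expressions below substitute the standard exact moments of the sample skewness and kurtosis under a normal population — an assumption, not the patent's formulas:

```python
import math

def central_moment(x, k):
    # k-th order sample central moment B_k.
    n = len(x)
    mean = sum(x) / n
    return sum((xi - mean) ** k for xi in x) / n

def skew_kurt(x):
    b2 = central_moment(x, 2)
    g1 = central_moment(x, 3) / b2 ** 1.5          # skewness estimate G_1
    g2 = central_moment(x, 4) / b2 ** 2            # kurtosis estimate G_2
    return g1, g2

def gaussian_check(x, z_alpha):
    """True if the residuals pass the skewness/kurtosis normality test,
    i.e. |mu_1| <= z_alpha and |mu_2| <= z_alpha (block not partitioned)."""
    n = len(x)
    g1, g2 = skew_kurt(x)
    # Assumed: exact moments of G_1 and G_2 for samples from a normal
    # population (the patent's expressions are given only as images).
    u1, var1 = 0.0, 6.0 * (n - 2) / ((n + 1) * (n + 3))
    u2 = 3.0 - 6.0 / (n + 1)
    var2 = 24.0 * n * (n - 2) * (n - 3) / ((n + 1) ** 2 * (n + 3) * (n + 5))
    mu1 = (g1 - u1) / math.sqrt(var1)
    mu2 = (g2 - u2) / math.sqrt(var2)
    return abs(mu1) <= z_alpha and abs(mu2) <= z_alpha
```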

For uncertain PU results, a statistical method determines the partition. Modeling shows that when the sum of squared residual differences of an 8×8 PU is below a certain threshold, the PU is judged to stay at 8×8; otherwise it is split into four 4×4 PUs.

The process of statistically partitioning the prediction unit PU includes:

Step 1: Compute the sum of squared residual differences SSD of the prediction unit, where SSD is computed as:

SSD = Σ_i (OrgYuv_i − pcPreYuv_i)²

where SSD denotes the sum of squared residual differences, OrgYuv denotes the original pixel values, and pcPreYuv denotes the predicted pixel values.

Step 2: Compute the prior detection threshold Q_step. Q_step is computed as:

Q_step = a_i · 2^⌊QP/6⌋

where Q_step denotes the prior detection threshold, a_i denotes a QP-dependent quantization coefficient, QP denotes the quantization parameter, and i = QP % 6, so that i ∈ [0,5].

The values of a_i are: a_0 = 0.625, a_1 = 0.6875, a_2 = 0.8125, a_3 = 0.875, a_4 = 1, a_5 = 1.125.
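Assuming the Q_step relation above (the original gives it only as an image; the listed a_i values match the familiar HEVC quantization-step table), the threshold can be computed as:

```python
def qstep(qp):
    # Prior detection threshold: Q_step = a_(QP % 6) * 2^(QP // 6).
    a = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]
    return a[qp % 6] * (2 ** (qp // 6))
```

For the four training QPs this gives Q_step = 8, 14, 26, and 44.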

Step 3: Judge, according to the prior detection threshold, whether SSD satisfies the prior hypothesis. If it does, the size of the prediction unit PU is 8×8; otherwise the PU is split into four 4×4 PUs. The specific decision formula (given as an image in the original) compares SSD against a threshold determined by the hyperparameter β and Q_step.

where SSD is the sum of squared residual differences and β is a hyperparameter.

Video sequence size    β hyperparameter value
416×240                12.0869
832×480                13.3564
1280×720               14.6832
1920×1080              15.0062

(The corresponding table of Q_step values is given as an image in the original.)

As shown in Figure 4, the specific CTU partitioning process for intra-frame prediction in the optimized HM encoder is as follows:

Read the quadtree data structure from the result file and obtain the partition result of the 64×64 node (an output greater than 0.85 is treated as a split). If the partition result is 1, split the node into four 32×32 CUs and skip computing the RD-cost of the 64×64 CU; if it is 0, stop recursing downward and compute the RD-cost of the current 64×64 CU; if the partition result is uncertain, compute the skewness and kurtosis of the 64×64 CU: if they satisfy the Gaussian distribution, stop recursing downward, otherwise split into four 32×32 CUs.

If the depth-0 CTU is split into four 32×32 CUs, the encoder encodes the four 32×32 CUs one by one in Z-scan order with the depth set to 1, and obtains the partition result of each 32×32 CU from the result file. If the partition result of the current CU is 1, split it into four 16×16 CUs and skip computing the RD-cost of the 32×32 CU; if it is 0, stop recursing downward and compute the RD-cost of the current 32×32 CU. As before, an uncertain partition result is resolved by the Gaussian test: if the resulting decision is 1, the current 32×32 CU is likewise split into four 16×16 CUs; if it is 0, the downward recursion stops.

If a depth-1 CU is split into four 16×16 CUs, the encoder encodes the four 16×16 CUs one by one in Z-scan order with the depth set to 2, and obtains the partition result of each 16×16 CU from the result file. If the partition result of the current CU is 1, split it into four 8×8 CUs and skip computing the RD-cost of the 16×16 CU; if it is 0, stop recursing downward and compute the RD-cost of the current 16×16 CU. An uncertain partition result is again resolved by the Gaussian test: if the decision is 1, the depth-2 CU is split into four 8×8 CUs; if it is 0, the downward recursion stops.

When an 8×8 CU is reached, the partition result of the 8×8 PU is obtained from the result file. If the result is 1, the current PU is split into four 4×4 PUs; if it is 0, the PU stays at 8×8. An uncertain result is resolved by the statistical method: if its decision is 1, the current PU is split into four 4×4 PUs; if it is 0, the PU stays at 8×8.
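The recursive walk described in the last four paragraphs can be sketched as follows. The pred lookup and the gaussian_ok callback are stand-ins for the network's result file and the Gaussian/statistical fallback tests (both names are illustrative, not from the patent); RD-cost computation is omitted:

```python
def encode_ctu(x, y, size, pred, gaussian_ok):
    """Recursively apply predicted split decisions to one 64x64 CTU and
    return the leaf CUs (where the encoder would compute RD-cost).
    pred maps (x, y, size) -> 1 (split), 0 (no split) or None (uncertain);
    gaussian_ok(x, y, size) is the Gaussian/statistical fallback test."""
    decision = pred.get((x, y, size))
    if decision is None:                 # uncertain: fall back to the test
        decision = 0 if gaussian_ok(x, y, size) else 1
    if decision == 0 or size == 8:       # leaf (8x8 is the minimum CU here)
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):                 # Z-scan order over the 4 children
        for dx in (0, half):
            leaves += encode_ctu(x + dx, y + dy, half, pred, gaussian_ok)
    return leaves
```

With one 32×32 quadrant split further and its four 16×16 children left uncertain (and passing the fallback test), the walk yields seven leaves.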

The embodiments above further describe the purpose, technical solutions, and advantages of the present invention in detail. It should be understood that they are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (9)

1. An HEVC intra-frame CTU partitioning method based on a convolutional neural network, characterized by comprising:

S1: acquiring video data and converting the video data into video frames;

S2: inputting the video frames into the quadtree neural network model to obtain the quadtree structures of all CTUs in the whole frame;

S3: using the optimized encoder to read the quadtree structures of all CTUs of the current frame, and partitioning the coding units (CUs) of the current coding tree unit (CTU) according to the structure of the current CTU quadtree;

S4: obtaining the partition result corresponding to the current coding unit CU; if the partition result of the CU is 0, computing the RD-cost of the current CU and stopping the downward recursive partitioning; if the partition result of the CU is 1, skipping the RD-cost computation and continuing the downward recursive partitioning; if the partition result of the CU is uncertain, using a Gaussian discriminator to judge the partition result of the CU; repeating this process until the recursion stops or the minimum CU size is reached;

the process of judging an uncertain CU partition result with the Gaussian discriminator comprising:

Step 1: computing the residual of the coding unit CU;

Step 2: computing the k-th order sample central moment B_k from the residual;

Step 3: computing the skewness estimate G_1 and the kurtosis estimate G_2 of the residual from B_k;

Step 4: since the skewness estimate G_1 and the kurtosis estimate G_2 follow Gaussian distributions, converting G_1 and G_2 into Gaussian form, and computing the means and variances of the residual skewness and kurtosis according to the Gaussian distributions;

Step 5: setting the critical value z_α, and setting the Gaussian decision condition according to the means, variances, and critical value of the skewness estimate G_1 and the kurtosis estimate G_2;

Step 6: if the skewness and kurtosis of the CU satisfy the Gaussian decision condition, not partitioning the CU; if they do not, partitioning the CU;

S5: when the coding size of the CU is 8×8, partitioning the prediction unit PU; if the partition result of the current PU is 0, the size of the current PU is 8×8; if the partition result of the PU is 1, splitting the current PU into four 4×4 PUs; if the partition result is uncertain, determining the size of the current PU by a statistical method.

2. The convolutional-neural-network-based HEVC intra-frame CTU partitioning method according to claim 1, characterized in that obtaining the structure of the CTU quadtree comprises:

S21: running the encoder HM, inputting the video frame data into the encoder HM, and computing the quadtree structures of all CTUs of the n-th frame of the video frame data;

S22: the encoder saving the computation results in a file named after the frame number, and outputting a flag file pred_over_n.t×t, where n denotes the frame index;

S23: when the encoder HM detects the flag file pred_over_n.t×t, starting to encode from the n-th frame; after the n-th frame is encoded, detecting the flag file of the next frame and continuing to encode until all frames have been encoded.

3. The convolutional-neural-network-based HEVC intra-frame CTU partitioning method according to claim 1, characterized in that the quadtree neural network model comprises 21 small-scale sub convolutional neural networks.

4. The convolutional-neural-network-based HEVC intra-frame CTU partitioning method according to claim 1, characterized in that the specific process of partitioning the coding unit CU comprises:

S41: the size of each coding tree unit CTU being 64×64; when the partition result is 1, splitting the CTU into four 32×32 coding units CU; when the partition result is 0, stopping the downward recursion and computing the RD-cost of the current CTU; if the partition result is uncertain, computing the skewness and kurtosis values of the CTU — if the Gaussian distribution is satisfied, stopping the downward recursion, otherwise splitting into four 32×32 CUs;

S42: splitting the depth-0 CTU into four 32×32 coding units CU, the encoder encoding the four 32×32 CUs one by one in Z-scan order with the depth set to 1, and obtaining the partition result of each 32×32 CU from the quadtree structure; if the partition result of the current CU is 1, splitting the current CU into four 16×16 CUs; if the partition result of the current CU is 0, stopping the downward recursion and computing the RD-cost of the current 32×32 CU; for an uncertain partition result, using the Gaussian decision to determine the partition result — if the decision is 1, splitting the current 32×32 CU into four 16×16 CUs, and if 0, stopping the downward recursion;

S43: splitting a depth-1 CU into four 16×16 CUs, the encoder encoding the four 16×16 CUs one by one in Z-scan order with the depth set to 2, and obtaining the partition result of each 16×16 CU from the quadtree structure; if the partition result of the current CU is 1, splitting the current CU into four 8×8 CUs; if the partition result is 0, stopping the downward recursion and computing the RD-cost of the current 16×16 CU; for an uncertain partition result, using the Gaussian decision to determine it — if the decision is 1, splitting the depth-2 CU into four 8×8 CUs, and if 0, stopping the downward recursion;

S44: when encoding reaches an 8×8 CU, obtaining the partition result of the 8×8 prediction unit PU from the quadtree structure; if the partition result is 1, splitting the current PU into four 4×4 PUs; if the partition result is 0, keeping the PU at 8×8; for an uncertain partition result, determining it by the statistical method — if the decision is 1, splitting the current PU into four 4×4 PUs, and if 0, keeping the PU at 8×8.

5. The convolutional-neural-network-based HEVC intra-frame CTU partitioning method according to claim 4, characterized in that the process of computing the RD-cost comprises:

Step 1: computing the rate-distortion cost J_SKIP in SKIP mode, SKIP being a skip mode of the PU;

Step 2: computing in turn the rate-distortion costs of the various inter-prediction modes, and finding the minimum rate-distortion cost J_inter among them;

Step 3: computing the rate-distortion costs of the various intra-prediction modes, and finding the minimum rate-distortion cost J_intra among them;

Step 4: comparing J_SKIP, J_inter, and J_intra to find the smallest rate-distortion cost, which is the RDCost.

6. The convolutional-neural-network-based HEVC intra-frame CTU partitioning method according to claim 5, characterized in that the rate-distortion cost is computed as:

J = B + λD

where B denotes the number of bits for CU size selection and mode selection, λ denotes the Lagrange multiplier, and D denotes the image distortion.

7. The convolutional-neural-network-based HEVC intra-frame CTU partitioning method according to claim 1, characterized in that the Gaussian expressions of the skewness estimate G_1 and the kurtosis estimate G_2 are:
G_1 ~ N(u_1, σ_1²),  G_2 ~ N(u_2, σ_2²)
where u_1 denotes the mean of the skewness estimate, σ_1² denotes its variance, n denotes the number of residual coefficients, u_2 denotes the mean of the kurtosis estimate, and σ_2² denotes its variance.
8. The convolutional-neural-network-based HEVC intra-frame CTU partitioning method according to claim 1, characterized in that the Gaussian decision statistics μ_1 and μ_2 are computed as:

μ_1 = (G_1 − u_1) / σ_1,  μ_2 = (G_2 − u_2) / σ_2

where μ_1 denotes the statistical hypothesis test value of the skewness estimate G_1, σ_1 denotes the variance of the skewness estimate, μ_2 denotes the statistical hypothesis test value of the kurtosis estimate G_2, u_2 denotes the mean of the kurtosis estimate, and σ_2 denotes the variance of the kurtosis estimate.
9. The convolutional-neural-network-based HEVC intra-frame CTU partitioning method according to claim 1, characterized in that the process of statistically partitioning the prediction unit PU comprises:

Step 1: computing the sum of squared residual differences SSD of the prediction unit;

Step 2: computing the prior detection threshold Q_step from the quantization parameter and a_i:

Q_step = a_i · 2^⌊QP/6⌋

Step 3: judging, according to the prior detection threshold, whether SSD satisfies the prior hypothesis; if it does, the size of the prediction unit PU is 8×8; otherwise splitting the PU into four 4×4 PUs;

where SSD denotes the sum of squared residual differences, Q_step denotes the prior detection threshold, a_i denotes the quantization coefficient, and QP denotes the quantization parameter.
CN202110147340.7A 2021-02-03 2021-02-03 HEVC intra-frame CTU partitioning method based on convolutional neural network Active CN112887712B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110147340.7A CN112887712B (en) 2021-02-03 2021-02-03 HEVC intra-frame CTU partitioning method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110147340.7A CN112887712B (en) 2021-02-03 2021-02-03 HEVC intra-frame CTU partitioning method based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN112887712A CN112887712A (en) 2021-06-01
CN112887712B true CN112887712B (en) 2021-11-19

Family

ID=76056837

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110147340.7A Active CN112887712B (en) 2021-02-03 2021-02-03 HEVC intra-frame CTU partitioning method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN112887712B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111757110A (en) * 2020-07-02 2020-10-09 中实燃气发展(西安)有限公司 Video coding method, coding tree unit dividing method, system, device and readable storage medium
CN113691808A (en) * 2021-07-01 2021-11-23 杭州未名信科科技有限公司 A Neural Network-Based Inter-coding Unit Size Division Method
CN113781588A (en) * 2021-07-01 2021-12-10 杭州未名信科科技有限公司 A Neural Network-Based Intra-coding Unit Size Division Method
CN114827604B (en) * 2022-04-14 2024-12-20 广东工业大学 A method and system for CTU division within a high-efficiency video coding frame
CN114513660B (en) * 2022-04-19 2022-09-06 宁波康达凯能医疗科技有限公司 Interframe image mode decision method based on convolutional neural network
CN116489386B (en) * 2023-03-24 2024-12-03 国网浙江省电力有限公司台州供电公司 VVC inter-frame rapid coding method based on reference block

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3886433A3 (en) * 2013-03-07 2021-10-27 Telefonaktiebolaget LM Ericsson (publ) Video transcoding
CN107222742B (en) * 2017-07-05 2019-07-26 中南大学 Method and device for fast selection of Merge mode in video coding based on spatiotemporal correlation
CN109788296A (en) * 2018-12-25 2019-05-21 中山大学 Interframe encode dividing elements method, apparatus and storage medium for HEVC
CN111355956B (en) * 2020-03-09 2023-05-09 蔡晓刚 Deep learning-based rate distortion optimization rapid decision system and method in HEVC intra-frame coding

Also Published As

Publication number Publication date
CN112887712A (en) 2021-06-01

Similar Documents

Publication Publication Date Title
CN112887712B (en) HEVC intra-frame CTU partitioning method based on convolutional neural network
CN107396124B (en) Video Compression Method Based on Deep Neural Network
CN110087087B (en) VVC inter-frame coding unit prediction mode early decision and block division early termination method
CN103873861B (en) Coding mode selection method for HEVC (high efficiency video coding)
CN105049850B (en) HEVC bit rate control methods based on area-of-interest
CN102137263B (en) Distributed video coding and decoding methods based on classification of key frames of correlation noise model (CNM)
WO2018192235A1 (en) Coding unit depth determination method and device
CN112738511B (en) A fast mode decision-making method and device combined with video analysis
CN106162167A (en) Efficient video coding method based on study
CN111462261A (en) Fast CU partition and intra decision method for H.266/VVC
CN105430391B (en) The intraframe coding unit fast selecting method of logic-based recurrence classifier
CN116489386B (en) VVC inter-frame rapid coding method based on reference block
CN103327327B (en) For the inter prediction encoding unit selection method of high-performance video coding HEVC
CN112770120B (en) 3D video depth map intra-frame rapid coding method based on depth neural network
CN108712647A (en) A kind of CU division methods for HEVC
CN108924558A (en) A kind of predictive encoding of video method neural network based
CN114286093A (en) Rapid video coding method based on deep neural network
CN114520914B (en) Scalable interframe video coding method based on SHVC (scalable video coding) quality
CN110677644A (en) Video coding and decoding method and video coding intra-frame predictor
CN104320658A (en) HEVC (High Efficiency Video Coding) fast encoding method
CN117041599B (en) A fast intra-frame coding method and system based on HEVC-VPCC
CN109688411B (en) A method and apparatus for estimating rate-distortion cost of video coding
CN104601992B (en) SKIP mode quick selecting methods based on Bayesian Smallest Risk decision
CN114827604A (en) Method and system for dividing CTU (transform coding unit) in high-efficiency video coding frame
Nguyen et al. Deep probabilistic model for lossless scalable point cloud attribute compression

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20240208

Address after: Room 801, 85 Kefeng Road, Huangpu District, Guangzhou City, Guangdong Province

Patentee after: Guangzhou Dayu Chuangfu Technology Co.,Ltd.

Country or region after: China

Address before: 400065 Chongwen Road, Nanshan Street, Nanan District, Chongqing

Patentee before: CHONGQING University OF POSTS AND TELECOMMUNICATIONS

Country or region before: China